Abstract

In  a  previous  paper  [7]  we  introduced  a  class  of  multiscale  dynamic  models  evolving  on 
dyadic  trees  in  which  each  level  in  the  tree  corresponds  to  the  representation  of  a  signal  at  a 
particular  scale.  One  of  the  estimation  algorithms  suggested  in  [7]  led  to  the  introduction  of 
a  new  class  of  Riccati  equations  describing  the  evolution  of  the  estimation  error  covariance 
as  multiresolution  data  is  fused  in  a  fine-to-coarse  direction.  This  equation  can  be  thought 
of  as  having  3  steps  in  its  recursive  description:  a  measurement  update  step,  a  fine-to-coarse 
prediction  step,  and  a  fusion  step.  In  this  paper  we  analyze  this  class  of  equations.  In 
particular  by  introducing  several  rudimentary  elements  of  a  system  theory  for  processes 
on  trees  we  develop  bounds  on  the  error  covariance  and  use  these  in  analyzing  stability 
and  steady-state  behavior  of  the  fine-to-coarse  filter  and  the  Riccati  equations.  While 
this  analysis  is  similar  in  spirit  to  that  for  standard  Riccati  equations  and  Kalman  filters, 
there  are  substantial  differences  that  arise  in  the  multiscale  context.  For  example,  the 
asymmetry  of  the  dyadic  tree  makes  it  necessary  to  define  multiscale  processes  via  a  coarse- 
to-fine  dynamic  model  and  also  to  define  the  first  step  in  a  fusion  processor  in  the  opposite 
direction  —  i.e.  fine-to-coarse.  Also,  the  notions  of  stability,  reachability,  and  observability 
are  different.  Most  importantly  for  the  analysis  here,  we  will  see  that  the  fusion  step  in  the 
fine-to-coarse  filter  and  Riccati  equation  requires  that  we  focus  attention  on  the  maximum 
likelihood  estimator  in  order  to  develop  a  stability  and  steady-state  theory. 
1 Introduction

Multiscale  signal  analysis  is  presently  an  extremely  active  research  topic  due  in  large 
part  to  the  emerging  theory  of  wavelet  transforms  [8,10,11]  and  to  a  broad  array  of 
applications  in  which  multiresolution  analysis  seems  to  be  needed  or  natural.  Our 
work in this area [7,2] has been motivated by a desire to develop multiscale statistical
models, inspired by the structure of wavelet transforms, that can then provide
the foundation for statistically optimal multiresolution processing algorithms.
In  particular  in  [7]  we  introduced  a  class  of  multiscale  state  models  evolving  in  a 
coarse-to-fine  direction  on  a  dyadic  tree  and  presented  several  algorithms  for  optimal 
estimation  for  these  processes,  i.e.  for  statistically  optimal  fusion  of  multiresolution 
measurements. In this paper we take a much more careful look at one of these algorithms
and develop the required system-theoretic concepts for systems on trees that
allow us to analyze and to understand more deeply the structure and properties of
this class of multiresolution data fusion algorithms.

In  the  next  section  we  briefly  review  the  modeling  framework  introduced  in  [7] 
and one of the estimation procedures derived therein. In particular, as we discuss,
the wavelet transform makes it natural to define multiscale models evolving from
coarse to fine resolution representations. On the other hand, the particular estimation
algorithm analyzed here, a two-sweep algorithm in the spirit of the Rauch-Tung-Striebel
smoothing algorithm, must have as its first step a sweep evolving in the
opposite direction, i.e. from fine to coarse scales. Furthermore, this sweep, which
resembles  a  Kalman  filter  recursion  (although  now  in  scale),  has  an  additional  step 
not  found  in  temporal  processing  corresponding  to  the  fusion  of  information  as  we 
move  from  fine-to-coarse  scales. 

The  remainder  of  this  paper  then  analyzes  in  detail  the  qualitative  properties  of 
this  fine-to-coarse  filtering  step.  In  particular  our  main  results  center  on  the  stability 
of this step and the convergence to steady state. As we will see, the fusion step makes
it necessary to view the optimal estimator as producing a maximum likelihood (ML)
estimate which is then combined with prior statistics, and it is the dynamics of the ML
estimate recursion which must be analyzed. Also, in order to analyze this recursion
we need to develop several system-theoretic notions for fine-to-coarse recursions on
dyadic trees. In particular, in Section 3 we motivate and define the ML version of
our fine-to-coarse Kalman filter. In Section 4 we develop notions of reachability and
observability which we then use in Section 5 to obtain bounds on the error covariance
of the filter. In Section 6 we then define and analyze $\ell_p$-stability for fine-to-coarse
recursions. As we will see, the conditions for stability depend strongly on the choice
of $p$. In Section 7 we then use our bounds on the error covariance as the basis for a
Lyapunov proof of $\ell_2$-stability of the fine-to-coarse filter, while in Section 8 we present
results on the steady-state filter.
2 Multiscale Stochastic Processes on Trees and Their Estimation

As described in [10,11], the wavelet transform of a function $f(x)$ provides a sequence
of approximations of the signal, at successively finer scales, consisting of linear combinations
of shifted versions of a single function $\phi(x)$ compressed or expanded to match
the scale in question. That is, the approximation of $f(x)$ at the $m$th scale is given by

$$f_m(x) = \sum_{n=-\infty}^{+\infty} f(m,n)\,\phi(2^m x - n) \tag{2.1}$$

As we describe in [7], the evolution of this approximation from scale to scale
describes a dynamical relationship between the coefficients $f(m,n)$ at one scale and
those at the next. Indeed this relationship defines a lattice on the points $(m,n)$, where
$(m+1,k)$ is connected to $(m,n)$ if $f(m,n)$ influences $f(m+1,k)$. For example, the so-called
Haar approximation, in which each $f(m,n)$ is simply an average of $f(x)$ over
an interval of length $2^{-m}$, naturally defines a dyadic tree structure on the points $(m,n)$
in which each point has two equally weighted descendants corresponding to the two
subintervals of length $2^{-m-1}$ at the $(m+1)$st scale obtained from the corresponding
interval of length $2^{-m}$ at the $m$th scale.
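
The Haar case can be illustrated with a minimal sketch. It assumes the finest-scale coefficients are supplied as a list whose length is a power of two; the function names `haar_coarsen` and `haar_pyramid` are ours and do not appear in [7]. Each coarser coefficient is the plain average of its two descendants:

```python
def haar_coarsen(fine):
    """Average adjacent pairs: coefficient n at scale m comes from
    coefficients 2n and 2n+1 at scale m+1 (equal weights, as in the Haar case)."""
    if len(fine) % 2 != 0:
        raise ValueError("expected an even number of coefficients")
    return [(fine[2 * n] + fine[2 * n + 1]) / 2.0 for n in range(len(fine) // 2)]


def haar_pyramid(finest):
    """Return all scales, coarsest (single root average) first, finest last."""
    levels = [list(finest)]
    while len(levels[-1]) > 1:
        levels.append(haar_coarsen(levels[-1]))
    return levels[::-1]
```

Read from top to bottom, the returned list has the shape of the dyadic tree: level $m$ holds $2^m$ coefficients.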

The  preceding  development  provides  the  motivation  for  the  study  of  stochastic 
processes  x(m,n)  defined  on  the  types  of  lattices  just  described.  While  we  have 
performed some analysis for the most general of these lattices [6], the work in [7] and
in this paper focuses on the dyadic tree. Let us make several comments about this
case.  First,  as  illustrated  in  Figure  1,  with  this  and  any  of  the  other  lattices,  the 
scale  index  m  is  time-like.  For  example  it  defines  a  natural  direction  of  recursion  for 
our  representation,  namely  a  signal  is  synthesized  via  a  coarse-to-fine  recursion.  In 
the case of our tree, with increasing $m$ (i.e. the direction of synthesis) denoting the
forward direction, we can then define a unique backward shift $\gamma^{-1}$ and two forward
shifts $\alpha$ and $\beta$ (see Figure 1). Also, for notational convenience we denote each node
of the tree by a single abstract index $t$ and let $T$ denote the set of all nodes. Thus if
$t = (m,n)$ then $\alpha t = (m+1, 2n)$, $\beta t = (m+1, 2n+1)$, and $\gamma^{-1}t = (m-1, [n/2])$, where
$[x]$ = integer part of $x$. Also we use the notation $m(t)$ to denote the scale (i.e. the
$m$-component of $t$). Finally, it is worth noting that while we have described multiscale
representations for continuous-time signals on $(-\infty, \infty)$, they can also be used
for signals on compact intervals or in discrete time. For example, a signal defined for
$t = 0, 1, \ldots$ can be represented by $M$ scales, each of which represents in essence
an averaged, decimated version of the finer scale immediately below it. In this case
the tree of Figure 1 has a bottom level, representing the samples of the signal itself,
and a single root node, denoted by 0, at the top. Such a root node also exists in the
representation of continuous-time signals defined on a compact interval.

Figure 1: Dyadic Tree Representation
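
The three shift operators can be written down directly. The sketch below assumes a node is stored as a pair `(m, n)` and takes the integer part as a floor, so that the parent relation stays consistent for negative `n` on the infinite tree; the function names are illustrative only.

```python
def alpha(t):
    """Forward shift to the first (left) descendant: (m, n) -> (m+1, 2n)."""
    m, n = t
    return (m + 1, 2 * n)


def beta(t):
    """Forward shift to the second (right) descendant: (m, n) -> (m+1, 2n+1)."""
    m, n = t
    return (m + 1, 2 * n + 1)


def gamma_inv(t):
    """Backward shift to the unique parent: (m, n) -> (m-1, floor(n/2))."""
    m, n = t
    return (m - 1, n // 2)  # floor division: an assumption for negative n


def scale(t):
    """The scale m(t), i.e. the m-component of the node."""
    return t[0]
```

Both descendants of a node share the same parent, so `gamma_inv(alpha(t))` and `gamma_inv(beta(t))` both return `t`.
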
With the preceding as motivation we introduced in [7] the following class of state-space
models on trees:

$$x(t) = A(m(t))\,x(\gamma^{-1}t) + B(m(t))\,w(t) \tag{2.2}$$

where $\{w(t), t \in T\}$ is a set of independent, zero-mean Gaussian random variables. If
we are dealing with a tree with a unique root node, 0, we require $w(t)$ to be independent
of $x(0)$, the zero-mean initial condition. The covariance of $w(t)$ is $I$ and that of $x(0)$ is
$P_x(0)$. If we wish the model eq.(2.2) to define a process over the entire infinite tree, we
simply require that $w(t)$ is independent of the “past” of $x$, i.e. $\{x(\tau) \mid m(\tau) < m(t)\}$.
If $A(m)$ is invertible for all $m$, this is equivalent to requiring $w(t)$ to be independent
of some $x(\tau)$ with $\tau \neq t$, $m(\tau) < m(t)$.

Let us make several comments about this model. Note first that the model does
evolve along the tree, as both $x(\alpha t)$ and $x(\beta t)$ evolve from $x(t)$. Secondly, we note
that this process has a Markovian property: given $x$ at scale $m$, $x$ at scale $m+1$ is
independent of $x$ at scales less than or equal to $m-1$. Indeed, for this to hold all
we need is for $w$ to be independent from scale to scale and not necessarily at each
individual node. Also, while the analysis we perform is easily extended to the case
in which $A$ and $B$ are arbitrary functions of $t$, we have chosen to focus here on a
translation-invariant model: we allow these quantities to depend only on scale. As
we will see, this leads to significant computational efficiencies and also, when this
dependence is chosen appropriately, these models lead to processes possessing self-similar
properties from scale to scale.

Note that the second-order statistics of $x(t)$ are easily computed. In particular
the covariance $P_x(t) = E[x(t)x^T(t)]$ evolves according to a Lyapunov equation on the
tree:

$$P_x(t) = A(m(t))\,P_x(\gamma^{-1}t)\,A^T(m(t)) + B(m(t))\,B^T(m(t)) \tag{2.3}$$

Note in particular that if $P_x(\tau)$ depends only on $m(\tau)$ for $m(\tau) \le m(t)-1$, then $P_x(t)$
depends only on $m(t)$. We will assume that this is the case and therefore will write
$P_x(t) = P_x(m(t))$. Note that this is always true if we are considering the subtree with
single root node 0. Also, if $A(m)$ is invertible for all $m$, and if $P_x(t) = P_x(m(t))$ at
some scale (i.e. at all $t$ for which $m(t)$ equals $m$ for some $m$), then $P_x(t) = P_x(m(t))$
for all $t$. Furthermore, if $A(m(t)) = A$ is stable and if $B(m(t)) = B$, let $P_x$ be the
solution to the algebraic Lyapunov equation

$$P_x = A P_x A^T + B B^T \tag{2.4}$$

In this case, if $P_x(0) = P_x$ (if we have a root node), or if we assume that $P_x(\tau) = P_x$
for $m(\tau)$ sufficiently negative^1, then $P_x(t) = P_x$ for all $t$, and we have the stationary
model.
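
For a scalar state with constant $A = a$ and $B = b$, eq.(2.3) and eq.(2.4) reduce to a one-line recursion and a closed form. The following sketch is only an illustration under those assumptions; the function names are ours.

```python
def lyapunov_by_scale(a, b, p_root, num_scales):
    """Scalar version of eq. (2.3): P_x(m) = a^2 P_x(m-1) + b^2,
    started from the root covariance and run coarse to fine."""
    covs = [p_root]
    for _ in range(num_scales):
        covs.append(a * a * covs[-1] + b * b)
    return covs


def stationary_covariance(a, b):
    """Scalar solution of eq. (2.4); requires a stable a, i.e. |a| < 1."""
    if abs(a) >= 1.0:
        raise ValueError("stationary solution requires |a| < 1")
    return b * b / (1.0 - a * a)
```

Starting the recursion at the stationary value leaves the covariance unchanged at every scale, which is the stationary model described above; starting it elsewhere, the covariance approaches the stationary value as the scale index grows.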

As we will see in a moment, the multiscale estimation algorithm we will analyze
involves a fine-to-coarse recursion requiring a corresponding version of eq.(2.2).
Assuming that $A(m)$ is invertible for all $m$ we can directly apply the results of [12]:

$$x(\gamma^{-1}t) = F(m(t))\,x(t) - A^{-1}(m(t))\,B(m(t))\,\tilde{w}(t) \tag{2.5}$$

with

$$F(m(t)) = A^{-1}(m(t))\left[I - B(m(t))B^T(m(t))P_x^{-1}(m(t))\right] = P_x(m(t)-1)\,A^T(m(t))\,P_x^{-1}(m(t)) \tag{2.6}$$

and where

$$\tilde{w}(t) = w(t) - E[w(t) \mid x(t)] \tag{2.7}$$

$$E[\tilde{w}(t)\tilde{w}^T(t)] = I - B^T(m(t))\,P_x^{-1}(m(t))\,B(m(t)) = \tilde{Q}(m(t)) \tag{2.8}$$

Note that $\tilde{w}(t)$ is a white noise process along all upward paths on the tree, i.e. $\tilde{w}(s)$
and $\tilde{w}(t)$ are uncorrelated if $t = \gamma^{-r}s$ or $s = \gamma^{-r}t$ for some $r$; otherwise $\tilde{w}(s)$ and $\tilde{w}(t)$
are not uncorrelated.
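
The two expressions for $F$ in eq.(2.6) can be checked against each other in the scalar case. The sketch below assumes scalar $A = a \neq 0$, $B = b$ and a positive parent covariance $P_x(m-1)$; the function name and return layout are ours.

```python
def backward_model(a, b, p_parent):
    """Scalar backward-model quantities for one scale step.

    p_parent plays the role of P_x(m-1). Returns both forms of F from
    eq. (2.6), the noise covariance Q-tilde of eq. (2.8), and P_x(m).
    """
    if a == 0.0:
        raise ValueError("A(m) must be invertible")
    p_child = a * a * p_parent + b * b       # eq. (2.3)
    f_first = (1.0 - b * b / p_child) / a    # A^{-1} [I - B B^T P_x^{-1}(m)]
    f_second = p_parent * a / p_child        # P_x(m-1) A^T P_x^{-1}(m)
    q_tilde = 1.0 - b * b / p_child          # eq. (2.8)
    return f_first, f_second, q_tilde, p_child
```

The two forms agree because $P_x(m) - b^2 = a^2 P_x(m-1)$ by eq.(2.3).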


^1 Once again, if $A$ is invertible and $P_x(t) = P_x$ at any single node, then $P_x(t) = P_x$ at all nodes.
In [7] we consider the estimation of the stochastic process described by eq.(2.2)
based on the measurements

$$y(t) = C(m(t))\,x(t) + v(t) \tag{2.9}$$

where $\{v(t), t \in T\}$ is a set of independent zero-mean Gaussian random variables
independent of $x(0)$ and $\{w(t), t \in T\}$. The covariance of $v(t)$ is $R(m(t))$. The model
eq.(2.9) allows us to consider multiple resolution measurements of our process. The
single resolution problem, i.e. when $C(m) = 0$ unless $m = M$ (the finest level), is also
of interest as it corresponds to the problem of restoring a noise-corrupted version of
a stochastic process possessing a multiscale description.

Three different algorithm structures are described in [7]. One of these is a generalization
of the well-known Rauch-Tung-Striebel (RTS) smoothing algorithm for causal
state models. Recall that the standard RTS algorithm involves a forward Kalman filtering
sweep followed by a backward sweep to compute the smoothed estimates. The
generalization to our models on trees has the same structure, with several important
differences. First, for the standard RTS algorithm the procedure is completely symmetric
with respect to time, i.e. we can start with a reverse-time Kalman filtering
sweep followed by a forward smoothing sweep. For processes on trees, the Kalman
filtering sweep must proceed from fine to coarse (i.e. in the reverse direction from
that in which the model eq.(2.2) is defined) followed by a coarse-to-fine smoothing
sweep^2. Furthermore, the Kalman filtering sweep, using the backward model eq.'s(2.5-2.8),
is somewhat more complex for processes on trees. In particular, one full step of
the Kalman filter recursion involves a measurement update, two parallel backward
predictions (corresponding to backward prediction along both of the paths descending
from a node), and the fusion of these predicted estimates. This last step has no
counterpart for state models evolving in time and is one of the major reasons for the
differences between the analysis of temporal Riccati equations and that presented in
this paper.

^2 The reason for this is not very complex. To allow the measurement on the tree at one point to
contribute  to  the  estimate  at  another  point  on  the  same  level  of  the  tree,  one  must  use  a  recursion 
that  first  moves  up  and  then  down  the  tree.  Reversing  the  order  of  these  steps  does  not  allow  one 
to  realize  such  contributions. 
Figure 2: Representation of Measurement Update and Merged Estimates (labels in the figure: $\hat{x}(t|t)$ and $\hat{x}(t|t+)$, each with the set of measurements on which it is based)

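To begin, let us define some notation:

$$Y_t = \{y(s) \mid s = t \text{ or } s \text{ is a descendant of } t\} = \{y(s) \mid s \in (\alpha,\beta)^* t,\ m(s) \le M\} \tag{2.10}$$

$$Y_t^+ = \{y(s) \mid s \in (\alpha,\beta)^* t,\ m(t) < m(s) \le M\} \tag{2.11}$$

$$\hat{x}(\cdot \mid t) = E[x(\cdot) \mid Y_t] \tag{2.12}$$

$$\hat{x}(\cdot \mid t+) = E[x(\cdot) \mid Y_t^+] \tag{2.13}$$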

The  interpretation  of  these  estimates  is  provided  in  Figure  2. 

As developed in [7], the Kalman filter and Riccati equation recursions have the following
steps. To begin, consider the measurement update. Specifically, suppose that
we have computed $\hat{x}(t|t+)$ and the corresponding error covariance, $P(m(t)|m(t)+)$;
the fact that this depends only on scale should be evident from the structure of the
problem. Then, standard estimation results yield

$$\hat{x}(t|t) = \hat{x}(t|t+) + K(m(t))\left[y(t) - C(m(t))\,\hat{x}(t|t+)\right] \tag{2.14}$$

$$K(m(t)) = P(m(t)|m(t)+)\,C^T(m(t))\,V^{-1}(m(t)) \tag{2.15}$$

$$V(m(t)) = C(m(t))\,P(m(t)|m(t)+)\,C^T(m(t)) + R(m(t)) \tag{2.16}$$

and the resulting error covariance is given by

$$P(m(t)|m(t)) = \left[I - K(m(t))\,C(m(t))\right]P(m(t)|m(t)+) \tag{2.17}$$

Note that the computations begin on the finest level ($m(t) = M$) with $\hat{x}(t|t+) = 0$,
$P(M|M+) = P_x(M)$.
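
As a minimal sketch of the measurement update for a scalar state and a scalar measurement (the function name and argument order are ours, not those of [7]), eq.'s(2.14-2.17) become:

```python
def measurement_update(x_pred, p_pred, y, c, r):
    """Scalar form of eqs. (2.14)-(2.17).

    x_pred, p_pred: estimate and error covariance before using y(t),
                    i.e. x(t|t+) and P(m|m+).
    y, c, r:        measurement, measurement gain C(m), noise variance R(m).
    Returns the updated pair x(t|t), P(m|m).
    """
    v = c * p_pred * c + r                  # innovation variance, eq. (2.16)
    k = p_pred * c / v                      # gain, eq. (2.15)
    x_upd = x_pred + k * (y - c * x_pred)   # eq. (2.14)
    p_upd = (1.0 - k * c) * p_pred          # eq. (2.17)
    return x_upd, p_upd
```

At the finest level the call would use `x_pred = 0` and `p_pred` equal to $P_x(M)$, as stated above. In information form the update adds $C^T R^{-1} C$ to the inverse covariance.
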

Suppose now that we have computed $\hat{x}(\alpha t|\alpha t)$ and $\hat{x}(\beta t|\beta t)$. Note that $Y_{\alpha t}$ and
$Y_{\beta t}$ are disjoint and these estimates can be calculated in parallel. Furthermore, once
again they have equal error covariances, $P(m(t)+1|m(t)+1)$. We then compute
$\hat{x}(t|\alpha t)$ and $\hat{x}(t|\beta t)$, which are given by

$$\hat{x}(t|\alpha t) = F(m(t)+1)\,\hat{x}(\alpha t|\alpha t) \tag{2.18}$$

$$\hat{x}(t|\beta t) = F(m(t)+1)\,\hat{x}(\beta t|\beta t) \tag{2.19}$$

with corresponding identical error covariances given by

$$P(m(t)|m(t)+1) = F(m(t)+1)\,P(m(t)+1|m(t)+1)\,F^T(m(t)+1) + Q(m(t)+1) \tag{2.20}$$

$$Q(m(t)+1) = A^{-1}(m(t)+1)\,B(m(t)+1)\,\tilde{Q}(m(t)+1)\,B^T(m(t)+1)\,A^{-T}(m(t)+1) \tag{2.21}$$

These estimates must then be fused to form $\hat{x}(t|t+)$ as follows:

$$\hat{x}(t|t+) = P(m(t)|m(t)+)\,P^{-1}(m(t)|m(t)+1)\left[\hat{x}(t|\alpha t) + \hat{x}(t|\beta t)\right] \tag{2.22}$$

$$P(m(t)|m(t)+) = \left[2P^{-1}(m(t)|m(t)+1) - P_x^{-1}(m(t))\right]^{-1} \tag{2.23}$$
The interpretation of these equations is that we are fusing together two estimates
based on independent sources of information, namely $Y_{\alpha t}$ and $Y_{\beta t}$, and on one common
information source, namely the prior statistics of $x(t)$. Eq.(2.23) ensures that this
common information is accounted for only once in the fused estimate.
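
The prediction and fusion steps can be sketched for a scalar state as follows. The names `predict_up` and `fuse` are illustrative; `q` stands for $Q(m(t)+1)$ of eq.(2.21) and `p_x` for the prior variance $P_x(m(t))$.

```python
def predict_up(x_child, p_child, f, q):
    """Eqs. (2.18)-(2.20): carry one child's updated estimate up to its parent."""
    return f * x_child, f * p_child * f + q


def fuse(x_from_alpha, x_from_beta, p_pred, p_x):
    """Eqs. (2.22)-(2.23): merge the two predictions of x(t).

    Both predictions have the same error variance p_pred; the prior
    information 1/p_x is contained in each of them, so it is subtracted once.
    """
    info = 2.0 / p_pred - 1.0 / p_x
    if info <= 0.0:
        raise ValueError("p_pred must not exceed 2 * p_x")
    p_fused = 1.0 / info
    x_fused = p_fused / p_pred * (x_from_alpha + x_from_beta)
    return x_fused, p_fused
```

A quick sanity check of the design: if neither subtree carries any information, then `p_pred` equals `p_x` and the fused variance is again `p_x`, so the prior is indeed counted only once.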

The  analysis  in  the  remainder  of  this  paper  focuses  on  the  upward  Kalman  filtering 
sweep.  For  completeness  we  describe  the  subsequent  downward  smoothing  sweep. 
Specifically,  when  we  reach  the  top  node  of  the  tree,  the  resulting  updated  estimate  is 
the  smoothed  estimate  at  that  point  which  then  serves  as  the  initial  condition  for  the 
downward recursion along the tree. This recursion combines the smoothed estimate
$\hat{x}_s(\gamma^{-1}t)$ with the filtered estimates from the upward sweep to produce $\hat{x}_s(t)$:

$$\hat{x}_s(t) = \hat{x}(t|t) + P(m(t)|m(t))\,F^T(m(t))\,P^{-1}(m(t)-1|m(t))\left[\hat{x}_s(\gamma^{-1}t) - \hat{x}(\gamma^{-1}t|t)\right] \tag{2.24}$$
3 Maximum Likelihood Estimator

In this section we examine the difficulties in analyzing our filtering equations. These
difficulties point to the need to decompose the filter into two parts: one representing
the filter initialized with no prior information, the ML filter, and the other representing
our estimate of the mean of the process.

We rewrite the set of Riccati equations for our filtering problem as follows.

$$P(m|m+1) = F(m+1)\,P(m+1|m+1)\,F^T(m+1) + G(m+1)\,\tilde{Q}(m+1)\,G^T(m+1) \tag{3.1}$$

$$P^{-1}(m|m) = P^{-1}(m|m+) + C^T(m)\,R^{-1}(m)\,C(m) \tag{3.2}$$

$$P^{-1}(m|m+) = 2P^{-1}(m|m+1) - P_x^{-1}(m) \tag{3.3}$$

where

$$G(m(t)) = -A^{-1}(m(t))\,B(m(t)) \tag{3.4}$$

Note that we can combine eq.(3.2,3.3) into the following single equation.

$$P^{-1}(m|m) = 2P^{-1}(m|m+1) - P_x^{-1}(m) + C^T(m)\,R^{-1}(m)\,C(m)$$

$$\qquad = P^{-1}(m|m+1) + C^T(m)\,R^{-1}(m)\,C(m) + P^{-1}(m|m+1) - P_x^{-1}(m) \tag{3.5}$$

The Riccati equations for our optimal filter, eq.'s(3.1-3.3), differ from standard
Riccati equations in two respects: 1) the explicit presence of the prior state covariance
$P_x(m(t))$ and 2) the presence of a scaling factor of 2 in eq.(3.3). The scaling factor is
intrinsic to our Riccati equations and is due to the fact that we are fusing pairs of
parallel information paths in going from level to level. The presence of $P_x(m(t))$ in the
Riccati equations accounts for the double counting of prior information in performing
this merge.

The presence of this term points to a significant complication in analyzing this
filter. Specifically, in standard Kalman filtering analysis the Riccati equation for
the error covariance can be viewed simply as the covariance of the error equations,
which can be analyzed directly without explicitly examining the state dynamics since
the error evolves as a state process itself. This is apparently not the case here
because of the explicit presence of $P_x(m)$ in eq.(3.5). Indeed, as we show later in
this section, if one examines the backward model eq.'s(2.5-2.8) and the Kalman filter
eq.'s(2.14,2.18,2.19,2.22) one finds that the upward dynamics for the error $x(t) - \hat{x}(t|t)$
are not decoupled from $x(t)$ unless $P_x^{-1}(m(t)) = 0$. This motivates the following
decomposition of the estimator into a dynamic part based on $P_x^{-1} = 0$ (the ML estimator)
followed by a gain adjustment to account for prior information.

To be precise, let $P_{ML}(m|m+1)$ and $P_{ML}(m|m)$ denote the error covariances produced
by our upward Kalman filter assuming that $P_x^{-1}(m) = 0$. These satisfy the following
Riccati equation, which does not depend explicitly on $P_x(m)$.

$$P_{ML}(m|m+1) = F(m+1)\,P_{ML}(m+1|m+1)\,F^T(m+1) + G(m+1)\,\tilde{Q}(m+1)\,G^T(m+1) \tag{3.6}$$

$$P_{ML}^{-1}(m|m) = 2P_{ML}^{-1}(m|m+1) + C^T(m)\,R^{-1}(m)\,C(m) \tag{3.7}$$

Note that the filtering equations for the ML estimator correspond exactly with the
equations for the optimal (Bayesian) filter with $P_{ML}(m|m)$ and $P_{ML}(m|m+1)$ being
substituted for $P(m|m)$ and $P(m|m+1)$. We refer to these as the ML filtering
equations.
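
To see how the factor of 2 in eq.(3.7) acts, the ML Riccati recursion can be iterated numerically. The sketch below is a simplification, not the general case treated in the paper: it assumes a scalar state and scale-invariant parameters, with `gqg` standing for the product $G\tilde{Q}G^T$, and it tracks the information $P_{ML}^{-1}(m|m)$ from the finest level upward.

```python
def ml_information_sweep(f, gqg, c, r, num_levels):
    """Iterate eqs. (3.6)-(3.7) for a scalar state with constant parameters.

    Returns the list of P_ML^{-1}(m|m), finest level first. At the finest
    level there is no information from below, so only c^2 / r is available.
    """
    if c == 0.0 or r <= 0.0 or gqg <= 0.0:
        raise ValueError("sketch assumes c != 0, r > 0 and gqg > 0")
    info = c * c / r
    history = [info]
    for _ in range(num_levels):
        p_pred = f * f / info + gqg        # eq. (3.6)
        info = 2.0 / p_pred + c * c / r    # eq. (3.7)
        history.append(info)
    return history
```

Under these assumptions the information is nondecreasing from level to level and stays below $2/(G\tilde{Q}G^T) + C^2/R$, since the update map is increasing in the previous information.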

Before elaborating further on the ML estimator, we describe its relationship to
the optimal estimator. The two are related in the following way.

$$\hat{x}(t|t) = P(m(t)|m(t))\,P_{ML}^{-1}(m(t)|m(t))\,\hat{x}_{ML}(t|t) \tag{3.8}$$

$$P^{-1}(m(t)|m(t)) = P_{ML}^{-1}(m(t)|m(t)) + P_x^{-1}(m(t)) \tag{3.9}$$

To derive these relationships we start by writing

$$Y_t = H_t\,x(t) + \theta(t) \tag{3.10}$$

$$E[\theta(t)\,x^T(t)] = 0 \tag{3.11}$$

$$E[\theta(t)\,\theta^T(t)] = R_t \tag{3.12}$$


where 
Recall that $Y_t$ is the set $\{y(s) \mid s = t$ or $s$ is a descendant of $t\}$. Eq.(3.10) follows
directly from our downward model for the process $x(t)$ on the tree. From eq.(3.10)
we can write the following maximum likelihood estimate.

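$$\hat{x}_{ML}(t|t) = \left(H_t^T R_t^{-1} H_t\right)^{-1} H_t^T R_t^{-1}\,Y_t \tag{3.13}$$

$$P_{ML}^{-1}(m(t)|m(t)) = H_t^T R_t^{-1} H_t \tag{3.14}$$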

Note that $\hat{x}_{ML}(t|t)$ can be computed using the ML filtering equations. This is true since
the ML filter computes the best estimate in the sense of minimizing the mean-square
error given no initial prior information, which from the invertibility of $F(m)$ and
from our Lyapunov equation for the evolution of the state covariance is equivalent
to the best estimate at some point $t$ given $P_x^{-1}(m(t)) = 0$. Furthermore, since $\theta(t)$ is
uncorrelated with $x(t)$ we can write the Bayesian estimate as follows.

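$$\hat{x}(t|t) = P(m(t)|m(t))\left[P_{ML}^{-1}(m(t)|m(t))\,\hat{x}_{ML}(t|t) + P_x^{-1}(m(t))\,m(x(t))\right] \tag{3.15}$$

$$P^{-1}(m(t)|m(t)) = P_{ML}^{-1}(m(t)|m(t)) + P_x^{-1}(m(t)) \tag{3.16}$$

where $m(x(t))$ is the mean of $x(t)$. But since we consider $x(t)$ to be a zero-mean
process, eq.(3.15) and eq.(3.8) are equivalent.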
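
The static relationship between the ML and Bayesian estimates in eq.'s(3.8,3.9) can be checked numerically. The sketch below makes simplifying assumptions that are ours: a scalar zero-mean state, and measurement errors that are independent with variances `r[i]` (a diagonal error covariance). The reference computation uses ordinary sequential Kalman updates starting from the prior.

```python
def ml_estimate(h, r, y):
    """ML estimate of a scalar x from y[i] = h[i] * x + theta[i], var(theta[i]) = r[i].
    Returns the estimate and the ML information P_ML^{-1}."""
    info = sum(hi * hi / ri for hi, ri in zip(h, r))
    if info == 0.0:
        raise ValueError("x cannot be determined from these measurements")
    x_ml = sum(hi * yi / ri for hi, ri, yi in zip(h, r, y)) / info
    return x_ml, info


def bayes_from_ml(x_ml, info_ml, p_x):
    """Eqs. (3.8)-(3.9): add the prior information 1/p_x and rescale the ML estimate."""
    p = 1.0 / (info_ml + 1.0 / p_x)
    return p * info_ml * x_ml, p


def bayes_direct(h, r, y, p_x):
    """Reference: sequential scalar Kalman updates starting from the zero-mean prior."""
    x, p = 0.0, p_x
    for hi, ri, yi in zip(h, r, y):
        k = p * hi / (hi * p * hi + ri)
        x, p = x + k * (yi - hi * x), (1.0 - k * hi) * p
    return x, p
```

Both routes give the same estimate and error variance, which is the content of the gain adjustment that follows the ML estimator.
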

There are several reasons for viewing the optimal estimator in this way. One is
that the ML Riccati equations are simpler because they do not include the explicit
presence of the prior information $P_x^{-1}(m(t))$. This simplicity is significant in that
the ML Riccati equations are readily amenable to stability analysis. The important
reason mentioned previously for focusing our analysis on the ML filter, and perhaps
a deeper one, is that the error dynamics for the optimal filter cannot be written as
a noise-driven process with closed-loop dynamics, whereas the error dynamics for the
ML filter can. Let us flesh out this last point in more detail.

Let us begin by examining the dynamics of our filter in the upward sweep of the
RTS algorithm, eq.'s(2.14-2.17, 2.18-2.21, 2.22, 2.23). We can rewrite the dynamics of
the filter in update form, eq.(2.14), as follows.

$$\hat{x}(t|t) = L(m(t))\,F(m(t)+1)\left(\hat{x}(\alpha t|\alpha t) + \hat{x}(\beta t|\beta t)\right) + K(m(t))\,y(t) \tag{3.17}$$

$$L(m(t)) = P(m(t)|m(t))\,P^{-1}(m(t)|m(t)+1) \tag{3.18}$$

We can also write the dynamics for our process in a similarly symmetric form.

$$x(t) = \tfrac{1}{2}F(m(t)+1)\left[x(\alpha t) + x(\beta t)\right] + \tfrac{1}{2}G(m(t)+1)\left[\tilde{w}(\alpha t) + \tilde{w}(\beta t)\right] \tag{3.19}$$

We can easily rewrite eq.(3.17) as

$$\hat{x}(t|t) = \left(I - K(m(t))C(m(t))\right)L'(m(t))\,F(m(t)+1)\left(\hat{x}(\alpha t|\alpha t) + \hat{x}(\beta t|\beta t)\right) + K(m(t))\,y(t) \tag{3.20}$$

$$L'(m(t)) = P(m(t)|m(t)+)\,P^{-1}(m(t)|m(t)+1) \tag{3.21}$$

By doing straightforward manipulations on eq.(3.20) and eq.(3.19) we can get

$$\tilde{x}(t|t) = \left(I - K(m(t))C(m(t))\right)x(t) - K(m(t))\,v(t) - \left(I - K(m(t))C(m(t))\right)L'(m(t))\,F(m(t)+1)\left(\hat{x}(\alpha t|\alpha t) + \hat{x}(\beta t|\beta t)\right) \tag{3.22}$$

$$\tilde{x}(t|t) = x(t) - \hat{x}(t|t) \tag{3.23}$$

The difficulty in proceeding any further with eq.(3.22) lies in the presence of the term
$L'(m(t))$. In standard filtering $L'(m(t)) = I$; said another way, there is no difference
between $P(m(t)|m(t)+)$ and $P(m(t)|m(t)+1)$. Let us write down the equations for
the ML filter and its corresponding error.

$$\hat{x}_{ML}(t|t) = \tfrac{1}{2}\left(I - K_{ML}(m(t))C(m(t))\right)F(m(t)+1)\left(\hat{x}_{ML}(\alpha t|\alpha t) + \hat{x}_{ML}(\beta t|\beta t)\right) + K_{ML}(m(t))\,y(t) \tag{3.24}$$

$$\tilde{x}_{ML}(t|t) = \left(I - K_{ML}(m(t))C(m(t))\right)x(t) - K_{ML}(m(t))\,v(t) - \tfrac{1}{2}\left(I - K_{ML}(m(t))C(m(t))\right)F(m(t)+1)\left(\hat{x}_{ML}(\alpha t|\alpha t) + \hat{x}_{ML}(\beta t|\beta t)\right) \tag{3.25}$$
By substituting eq.(3.19) into eq.(3.25) we get

$$\tilde{x}_{ML}(t|t) = \tfrac{1}{2}\left(I - K_{ML}(m(t))C(m(t))\right)F(m(t)+1)\left(\tilde{x}_{ML}(\alpha t|\alpha t) + \tilde{x}_{ML}(\beta t|\beta t)\right) + \tfrac{1}{2}\left(I - K_{ML}(m(t))C(m(t))\right)G(m(t)+1)\left(\tilde{w}(\alpha t) + \tilde{w}(\beta t)\right) - K_{ML}(m(t))\,v(t) \tag{3.26}$$

Note that eq.(3.26) has the same algebraic structure as the equations for the error
dynamics of the standard Kalman filter except for the scaling factor of $\frac{1}{2}$ and the fact
that there are two terms in the immediate past being merged. Both the scaling factor
and the merging of pairs of points are crucial to the study of the stability of the filter.
As we will see in Section 7, the appropriate scaling factor is necessary for controlling
in some sense the potential growth that might occur in merging points.

Also, for future reference, let us rewrite eq.(3.26) using the following equality:

$$\tfrac{1}{2}\left(I - K_{ML}(m(t))C(m(t))\right) = P_{ML}(m(t)|m(t))\,P_{ML}^{-1}(m(t)|m(t)+1) \tag{3.27}$$

We can rewrite eq.(3.26) as

$$\tilde{x}_{ML}(t|t) = P_{ML}(m(t)|m(t))\,P_{ML}^{-1}(m(t)|m(t)+1)\,F(m(t)+1)\left(\tilde{x}_{ML}(\alpha t|\alpha t) + \tilde{x}_{ML}(\beta t|\beta t)\right) + P_{ML}(m(t)|m(t))\,P_{ML}^{-1}(m(t)|m(t)+1)\,G(m(t)+1)\left(\tilde{w}(\alpha t) + \tilde{w}(\beta t)\right) - K_{ML}(m(t))\,v(t) \tag{3.28}$$
4 Reachability, Observability, and Reconstructibility

In this section we develop certain system-theoretic constructs which are useful in
analyzing both the stability and the steady-state characteristics of our filter. In
particular we define notions of reachability, observability, and reconstructibility on
dyadic trees in terms of system dynamics going up the tree.

4.1  Upward  Reachability 

We begin with the notion of reachability for a system defined going up a tree. Analogous
to the standard time-series case, reachability involves the notion of being able
to reach arbitrary states at some point $t$ on the tree given arbitrary inputs in the
past, where in the case of processes evolving up a tree the past refers to points in the
subtree under $t$. Recall that we can rewrite the dynamics for our backward process
up the tree, eq.(2.5), in the following form.

$$x(t) = \tfrac{1}{2}F(m(t)+1)\left[x(\alpha t) + x(\beta t)\right] + \tfrac{1}{2}G(m(t)+1)\left[\tilde{w}(\alpha t) + \tilde{w}(\beta t)\right] \tag{4.1}$$

Also, recall that in our backward model $\tilde{w}(t)$ is a white noise process along upward
paths on the tree. For the analysis of reachability, however, we simply view $\tilde{w}(t)$ as
the input to the system eq.(4.1).

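We define the following vectors,

$$X_{M,t_0} = \left[\, x^T(\alpha^M t_0) \;\; \ldots \;\; x^T(\beta^M t_0) \,\right]^T \tag{4.2}$$

$$W_{M,t_0} = \left[\, \tilde{w}^T(\alpha t_0) \;\; \tilde{w}^T(\beta t_0) \;\; \ldots \;\; \tilde{w}^T(\alpha^M t_0) \;\; \ldots \;\; \tilde{w}^T(\beta^M t_0) \,\right]^T \tag{4.3}$$

which have the following interpretation. Consider an arbitrary point on the tree,
$t_0$. The vector $X_{M,t_0}$ denotes the vector of $2^M$ points at the $M$th level down in the
subtree under $t_0$; i.e. $X_{M,t_0}$ includes all of the nodes at this level that influence the
value of $x(t_0)$. The vector $W_{M,t_0}$ comprises the full set of inputs that influences $x(t_0)$
starting from initial condition $X_{M,t_0}$, i.e. the $\tilde{w}(t)$ in the entire subtree down to $M$
levels from $t_0$. We define upward reachability to be the following.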
Definition 4.1 The system is upward reachable from $X_{M,t_0}$ to $x(t_0)$ if, given any
$\bar{X}_{M,t_0}$ and any desired $\bar{x}(t_0)$, it is possible to specify $W_{M,t_0}$ so that if $X_{M,t_0} = \bar{X}_{M,t_0}$,
then $x(t_0) = \bar{x}(t_0)$.

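In studying conditions for reachability, since we are given $\bar{X}_{M,t_0}$, we can set it equal
to zero without loss of generality. Note that if $X_{M,t_0} = 0$, then we have

$$x(t_0) = \mathcal{G}\,W_{M,t_0} \tag{4.4}$$

where

$$\mathcal{G} = \Big[\, \Psi(0)\;\Psi(0) \;\Big|\; \Psi(1)\;\Psi(1)\;\Psi(1)\;\Psi(1) \;\Big|\; \ldots \;\Big|\; \underbrace{\Psi(M-2)\ldots\Psi(M-2)}_{2^{M-1}\text{ times}} \;\Big|\; \underbrace{\Psi(M-1)\ldots\Psi(M-1)}_{2^{M}\text{ times}} \,\Big] \tag{4.5}$$

$$\Psi(i) = \left(\tfrac{1}{2}\right)^{i+1}\Phi(m(t_0), m(t_0)+i)\,G(m(t_0)+i+1) \tag{4.6}$$

$$\Phi(m_1, m_2) = \begin{cases} I & m_1 = m_2 \\ F(m_1+1)\,\Phi(m_1+1, m_2) & m_1 < m_2 \end{cases} \tag{4.7}$$

$$\Phi(m-1, m) = F(m) \tag{4.8}$$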

Let us also define the following quantity.

Definition 4.2 Upward-reachability Grammian

$$\mathcal{R}(t_0, M) = \mathcal{G}\mathcal{G}^T = \sum_{i=0}^{M-1} 2^{-i-1}\,\Phi(m(t_0), m(t_0)+i)\,G(m(t_0)+i+1)\,G^T(m(t_0)+i+1)\,\Phi^T(m(t_0), m(t_0)+i) \tag{4.9}$$

From eq.(4.4) we see that the ability to reach all possible values of $x(t_0)$, given arbitrary
inputs, depends on the rank of the matrix $\mathcal{G}$. This, along with the fact that
the rank of $\mathcal{G}$ equals the rank of $\mathcal{G}\mathcal{G}^T$, leads to the following, where $x(t)$ is an $n$-dimensional
vector:

Proposition 4.1 The system is upward reachable from $X_{M,t_0}$ to $x(t_0)$ iff $\mathcal{G}$ has
rank $n$ iff $\mathcal{R}(t_0, M)$ has rank $n$.
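
A small numerical sketch may help fix the structure of eq.(4.5) and eq.(4.9). It assumes constant $F$ (2x2) and $G$ (2x1), so that $\Phi(m, m+i) = F^i$; the helper names are ours. The Grammian is computed once from the sum in eq.(4.9) and once by weighting each block $\Psi(i)$ by the number of times it is repeated in $\mathcal{G}$.

```python
def mat_vec(a, v):
    """Product of a 2x2 matrix (nested lists) with a length-2 vector."""
    return [a[0][0] * v[0] + a[0][1] * v[1], a[1][0] * v[0] + a[1][1] * v[1]]


def outer(v):
    """Outer product v v^T of a length-2 vector."""
    return [[v[0] * v[0], v[0] * v[1]], [v[1] * v[0], v[1] * v[1]]]


def gramian_from_formula(f, g, depth):
    """Eq. (4.9) for constant F and G: sum over i of 2^{-i-1} F^i G G^T (F^i)^T."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    col = list(g)                                  # holds F^i G, starting at i = 0
    for i in range(depth):
        o, w = outer(col), 2.0 ** (-(i + 1))
        for a in range(2):
            for b in range(2):
                total[a][b] += w * o[a][b]
        col = mat_vec(f, col)
    return total


def gramian_from_blocks(f, g, depth):
    """Form G G^T from eq. (4.5): Psi(i) appears 2^{i+1} times, once per node."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    col = list(g)
    for i in range(depth):
        psi = [0.5 ** (i + 1) * c for c in col]    # eq. (4.6)
        o, copies = outer(psi), 2 ** (i + 1)
        for a in range(2):
            for b in range(2):
                total[a][b] += copies * o[a][b]
        col = mat_vec(f, col)
    return total


def det2(m):
    """Determinant of a 2x2 matrix; nonzero means rank 2."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]
```

For example, with `f = [[1.0, 1.0], [0.0, 1.0]]` and `g = [0.0, 1.0]` the Grammian is singular for a subtree of depth 1 and has full rank for depth 2, in line with the usual rank condition on $[G \mid FG]$.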
Note that $\mathcal{R}(t_0, M)$ bears a strong similarity to the standard reachability Grammian
for the following system.

$$x(m) = \tfrac{1}{2}F(m+1)\,x(m+1) + \tfrac{1}{2}G(m+1)\,u(m+1) \tag{4.10}$$

where the reachability Grammian in this case is

$$\mathcal{R}^*(m, m+M) = \sum_{i=0}^{M-1} 2^{-2(i+1)}\,\Phi(m, m+i)\,G(m+i+1)\,G^T(m+i+1)\,\Phi^T(m, m+i) = \mathcal{G}^*(\mathcal{G}^*)^T$$

$$\mathcal{G}^* = \left[\, \Psi(0) \;\; \Psi(1) \;\; \ldots \;\; \Psi(M-2) \;\; \Psi(M-1) \,\right] \tag{4.11}$$

In fact it is evident from the definitions in eq.'s(4.5,4.11) that the rank of $\mathcal{G}$ is equal
to the rank of $\mathcal{G}^*$. This leads to the following corollary.

Corollary 4.1 The system is upward reachable from $X_{M,t_0}$ to $x(t_0)$ iff for any
$\alpha, \beta > 0$, $\mathcal{R}^*_{\alpha,\beta}(m(t_0), m(t_0)+M)$ has rank $n$, where
$\mathcal{R}^*_{\alpha,\beta}(m(t_0), m(t_0)+M)$ is the reachability grammian for the system

$$x(m) = \alpha F(m+1)\,x(m+1) + \beta G(m+1)\,u(m+1) \tag{4.12}$$

Note that if $F$ and $G$ are constant in eq.(4.1), then reachability is equivalent to the
usual condition, i.e. $\mathrm{rank}[\, G \mid FG \mid \cdots \mid F^{n-1}G \,] = n$.
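
As a small illustration of Proposition 4.1 in the scale-invariant case, the following Python sketch assembles the matrix of eq.(4.5) from constant $F$ and $G$ and computes its rank exactly. The helper names (`mat_mul`, `rank`, `upward_reachability_matrix`) and the example matrices are illustrative assumptions and are not taken from the original development.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Plain nested-list matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def rank(A):
    """Rank by Gaussian elimination over exact rationals."""
    rows = [[Fraction(x) for x in row] for row in A]
    r = 0
    n_cols = len(rows[0]) if rows else 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def upward_reachability_matrix(F, G, M):
    """Concatenate 2^(i+1) copies of Psi(i) = (1/2)^(i+1) F^i G for i = 0..M-1."""
    n = len(F)
    power = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]  # F^0
    blocks = []
    for i in range(M):
        scale = Fraction(1, 2 ** (i + 1))
        psi = [[scale * x for x in row] for row in mat_mul(power, G)]
        blocks.extend([psi] * 2 ** (i + 1))
        power = mat_mul(power, F)  # F^(i+1) for the next level
    return [sum((blk[r] for blk in blocks), []) for r in range(n)]


# Hypothetical example: a nilpotent F with a single input channel.
F = [[0, 1], [0, 0]]
G = [[0], [1]]
rank_one_level = rank(upward_reachability_matrix(F, G, 1))
rank_two_levels = rank(upward_reachability_matrix(F, G, 2))
```

For this choice one level of inputs reaches only a one-dimensional subspace, while two levels give rank $n = 2$, in agreement with the rank of $[\,G \mid FG\,]$.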

4.2  Upward  Observability  and  Reconstructibility 

We  develop  the  notion  of  observability  and  the  notion  of  reconstructibility  on  trees. 
Defined  on  trees,  observability  corresponds  to  the  notion  of  being  able  to  uniquely 
determine  the  points  at  the  bottom  of  a  subtree,  i.e.  the  “initial  conditions” ,  given 
knowledge  of  the  inputs  and  observations  in  the  subtree.  It  is  also  useful  to  develop 
the  weaker  notion  corresponding  to  being  able  to  uniquely  determine  the  single  point 
at  the  top  of  a  subtree  given  knowledge  of  the  inputs  and  observations  in  the  subtree. 
This  notion  is  analogous  to  reconstructibility  for  standard  systems;  thus,  we  adopt 
the same term for the notion on trees.

Let us define

$$Y_{M,t_0} \triangleq [\; y^T(t_0) \mid y^T(\alpha t_0),\ y^T(\beta t_0) \mid \cdots \mid y^T(\alpha^M t_0),\ \ldots,\ y^T(\beta^M t_0) \;]^T \tag{4.13}$$

where

$$y(t) = C(m(t))\,x(t) \tag{4.14}$$

Definition 4.3 The system is upward observable from $X_{M,t_0}$ to $x(t_0)$ if, given
knowledge of $W_{M,t_0}$ and $Y_{M,t_0}$, we can uniquely determine $X_{M,t_0}$.

Note that if $W_{M,t_0} = 0$ then

$$Y_{M,t_0} = \mathcal{H}_M X_{M,t_0} \tag{4.15}$$

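where $\mathcal{H}_M$ is most easily visualized if we partition it compatibly with the levels of
the observations in $Y_{M,t_0}$.

$$
\mathcal{H}_M \triangleq
\left[\begin{array}{cccc}
\Theta(0) \cdots \Theta(0) & \Theta(0) \cdots \Theta(0) & \Theta(0) \cdots \Theta(0) & \Theta(0) \cdots \Theta(0) \\ \hline
\Theta(1) \cdots \Theta(1) & \Theta(1) \cdots \Theta(1) & 0 & 0 \\
0 & 0 & \Theta(1) \cdots \Theta(1) & \Theta(1) \cdots \Theta(1) \\ \hline
\Theta(2) \cdots \Theta(2) & 0 & 0 & 0 \\
0 & \Theta(2) \cdots \Theta(2) & 0 & 0 \\
0 & 0 & \Theta(2) \cdots \Theta(2) & 0 \\
0 & 0 & 0 & \Theta(2) \cdots \Theta(2) \\ \hline
\multicolumn{4}{c}{\vdots} \\ \hline
\multicolumn{4}{c}{\operatorname{diag}\big(\Theta(M),\ \Theta(M),\ \ldots,\ \Theta(M)\big)}
\end{array}\right]
\tag{4.16}
$$

Each of the four column groups shown spans $2^{M-2}$ block columns, so that every block row
has $2^M$ blocks; the last group of rows is block diagonal with $2^M$ copies of $\Theta(M)$.

Here

$$\Theta(i) \triangleq \left(\tfrac{1}{2}\right)^{M-i} C(m(t_0)+i)\, \phi(m(t_0)+i,\ m(t_0)+M) \tag{4.17}$$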

As a simple example to help clarify the structure of the matrix $\mathcal{H}_M$ consider the
matrix $\mathcal{H}_2$ for the scale-invariant case, i.e. where $F(m) = F$, $C(m) = C$.

$$
\mathcal{H}_2 =
\begin{bmatrix}
\tfrac{1}{4}CF^2 & \tfrac{1}{4}CF^2 & \tfrac{1}{4}CF^2 & \tfrac{1}{4}CF^2 \\
\tfrac{1}{2}CF & \tfrac{1}{2}CF & 0 & 0 \\
0 & 0 & \tfrac{1}{2}CF & \tfrac{1}{2}CF \\
C & 0 & 0 & 0 \\
0 & C & 0 & 0 \\
0 & 0 & C & 0 \\
0 & 0 & 0 & C
\end{bmatrix}
\tag{4.18}
$$

That is, at level $i$, there are $2^i$ measurements each of which provides information
about the sum of a block of $2^{M-i}$ components of $X_{M,t_0}$. Note that this makes clear
that upward observability is indeed a very strong condition. Specifically, since
successively larger blocks of $X_{M,t_0}$ are summed as we move up the tree, subsequent
measurements provide no information about the differences among the values that
have been summed. For example consider $M = 1$. In this case $y(t)$ contains
information about the sum $x(\alpha t) + x(\beta t)$ and thus information about
$x(\alpha t) - x(\beta t)$ must come from $y(\alpha t)$ and $y(\beta t)$. This places severe
constraints on the system matrices. In particular a necessary condition for observability
is that $y$ have dimension larger than $\frac{n}{2}$ (otherwise $\mathcal{H}_M$ has fewer rows
than columns).
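
The block-sum structure described above can be made concrete with a short Python sketch for the scalar, scale-invariant case ($n = 1$, one scalar measurement per node). The function names `build_H` and `measure` and the numerical values are assumptions made for illustration only.

```python
def build_H(f, c, M):
    """Scalar, scale-invariant H_M; rows are grouped by level i = 0..M."""
    n_cols = 2 ** M
    rows = []
    for i in range(M + 1):
        theta = c * (0.5 * f) ** (M - i)   # Theta(i) = (1/2)^(M-i) C F^(M-i)
        width = 2 ** (M - i)               # bottom-level nodes under one level-i node
        for k in range(2 ** i):            # one measurement per level-i node
            row = [0.0] * n_cols
            row[k * width:(k + 1) * width] = [theta] * width
            rows.append(row)
    return rows


def measure(H, X):
    """Noise-free, input-free observations Y = H X as in eq.(4.15)."""
    return [sum(h * x for h, x in zip(row, X)) for row in H]


H2 = build_H(f=1.0, c=1.0, M=2)
X = [1.0, 2.0, 3.0, 4.0]                   # hypothetical bottom-level states
Y = measure(H2, X)
```

With these values the top measurement sees only the scaled total of all four bottom-level states, the two middle measurements see scaled sums of pairs, and only the bottom-level measurements distinguish individual states. In the scalar case the matrix has $2^{M+1} - 1$ rows and $2^M$ columns.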

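We also define the following.

Definition 4.4 Upward-observability Grammian

$$\mathcal{M}_M \triangleq \mathcal{H}_M^T \mathcal{H}_M \tag{4.19}$$

where

$$\mathcal{M}_k = \mathcal{U}(k, 0) \tag{4.20}$$

$$\mathcal{U}(k, k) \triangleq \sum_{i=0}^{k} \left(\tfrac{1}{2}\right)^{2(k-i)} \phi^T(m(t_0)+i,\ m(t_0)+k)\, \mathcal{C}(m(t_0)+i)\, \phi(m(t_0)+i,\ m(t_0)+k) \tag{4.21}$$

$$\mathcal{C}(k) \triangleq C^T(k)\,C(k) \tag{4.22}$$

$$\mathcal{U}(k, l) \triangleq \begin{bmatrix} \mathcal{U}(k, l+1) & \mathcal{S}(k, l) \\ \mathcal{S}(k, l) & \mathcal{U}(k, l+1) \end{bmatrix} \tag{4.23}$$

and $\mathcal{S}(k, l)$ is a block matrix with $2^{k-l-1} \times 2^{k-l-1}$ blocks each of which equals

$$\mathcal{T}(k, l) \triangleq \sum_{i=0}^{l} \left(\tfrac{1}{2}\right)^{2(k-i)} \phi^T(m(t_0)+i,\ m(t_0)+k)\, C^T(m(t_0)+i)\, C(m(t_0)+i)\, \phi(m(t_0)+i,\ m(t_0)+k) \tag{4.24}$$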

Once again we consider the scale-invariant case, this time in order to make explicit
the structure of the matrix $\mathcal{M}_M$. The following is $\mathcal{M}_2$ for the scale-invariant case.

$$
\mathcal{M}_2 =
\begin{bmatrix}
M_1 & M_2 & M_3 & M_3 \\
M_2 & M_1 & M_3 & M_3 \\
M_3 & M_3 & M_1 & M_2 \\
M_3 & M_3 & M_2 & M_1
\end{bmatrix}
\tag{4.25}
$$

where

$$M_1 = \tfrac{1}{16}(F^2)^T C^T C F^2 + \tfrac{1}{4} F^T C^T C F + C^T C \tag{4.26}$$

$$M_2 = \tfrac{1}{16}(F^2)^T C^T C F^2 + \tfrac{1}{4} F^T C^T C F \tag{4.27}$$

$$M_3 = \tfrac{1}{16}(F^2)^T C^T C F^2 \tag{4.28}$$

From eq.(4.15) we see that being able to uniquely determine $X_{M,t_0}$ from $Y_{M,t_0}$ is
equivalent to requiring the null space of the matrix $\mathcal{H}_M$ to be 0. This leads to the
following.

Proposition 4.2 The system is upward observable from $X_{M,t_0}$ to $x(t_0)$ iff
$\mathcal{N}(\mathcal{H}_M) = 0$ iff $\mathcal{M}_M$ is invertible.

A much weaker notion than that of observability is the notion of reconstructibility.
Reconstructibility requires only the ability to determine the single point at the top
of a subtree given knowledge of the inputs and observations in the subtree.

Definition 4.5 The system is upward reconstructible from $X_{M,t_0}$ to $x(t_0)$ if, given
knowledge of $W_{M,t_0}$ and $Y_{M,t_0}$, we can uniquely determine $x(t_0)$.

We  also  define  the  following. 




Definition 4.6 Upward-reconstructibility Grammian

$$\mathcal{O}(t_0, M) \triangleq I_M \mathcal{H}_M^T \mathcal{H}_M I_M^T = \sum_{i=0}^{M} 2^{i}\, \phi^T(m(t_0)+i,\ m(t_0)+M)\, C^T(m(t_0)+i)\, C(m(t_0)+i)\, \phi(m(t_0)+i,\ m(t_0)+M) \tag{4.29}$$

where

$$I_M \triangleq \underbrace{[\; I \mid I \mid \cdots \mid I \;]}_{2^M\ \text{times}} \tag{4.30}$$

and each $I$ is an $n \times n$ identity matrix.

Note that if $W_{M,t_0} = 0$, then

$$x(t_0) = \Phi(t_0)\, X_{M,t_0} \tag{4.31}$$

where

$$\Phi(t_0) \triangleq \left(\tfrac{1}{2}\right)^{M} \phi(m(t_0),\ m(t_0)+M)\, I_M \tag{4.32}$$

Since the condition of reconstructibility only requires being able to uniquely
determine the single point $x(t_0)$ from the measurements in the subtree, we guarantee
this condition by requiring that any vector in the nullspace $\mathcal{N}(\mathcal{H}_M)$ is also in the
nullspace $\mathcal{N}(\Phi(t_0))$. We thus have the following, the proof of which can be found in
the appendix.

Theorem 4.1 The system is upward reconstructible iff $\mathcal{N}(\mathcal{H}_M) \subset \mathcal{N}(\Phi(t_0))$. If
$F(m)$ is invertible for all $m$, this is equivalent to the invertibility of $\mathcal{O}(t_0, M)$.

Note that $\mathcal{O}(t_0, M)$ bears a strong similarity to the standard observability grammian
for the following system.

$$x(m) = \alpha F(m+1)\,x(m+1) + G(m+1)\,u(m+1) \tag{4.33}$$

$$y(m) = \beta C(m)\,x(m) \tag{4.34}$$

where the observability grammian in this case is

$$\mathcal{O}_{\alpha,\beta}(m(t_0),\ m(t_0)+M) \triangleq \sum_{i=0}^{M} \alpha^{2(M-i)} \beta^{2}\, \phi^T(m(t_0)+i,\ m(t_0)+M)\, C^T(m(t_0)+i)\, C(m(t_0)+i)\, \phi(m(t_0)+i,\ m(t_0)+M) \tag{4.35}$$

Corollary 4.2 Assuming that $F(m)$ is invertible for all $m$, the system is upward
reconstructible from $X_{M,t_0}$ to $x(t_0)$ iff $\mathcal{O}_{\alpha,\beta}(m(t_0),\ m(t_0)+M)$ has rank $n$.
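
The two expressions for the upward-reconstructibility grammian in eq.(4.29) can be compared numerically. The Python sketch below does this for the scalar, scale-invariant case, where $\phi(m(t_0)+i, m(t_0)+M) = f^{M-i}$ and $I_M$ is a row of ones; the function names and parameter values are hypothetical.

```python
def level_rows(f, c, M):
    """Yield (theta, start, stop) for every row of the scalar H_M."""
    for i in range(M + 1):
        theta = c * (0.5 * f) ** (M - i)   # Theta(i) in the scalar case
        width = 2 ** (M - i)
        for k in range(2 ** i):
            yield theta, k * width, (k + 1) * width


def gramian_closed_form(f, c, M):
    """Sum over levels of 2^i (c f^(M-i))^2, the scalar form of eq.(4.29)."""
    return sum(2 ** i * (c * f ** (M - i)) ** 2 for i in range(M + 1))


def gramian_from_H(f, c, M):
    """I_M H^T H I_M^T, i.e. the squared norm of H applied to the all-ones vector."""
    return sum((theta * (stop - start)) ** 2
               for theta, start, stop in level_rows(f, c, M))
```

In this sketch the factor $2^{M-i}$ contributed by summing a row of $\mathcal{H}_M$ cancels the $(1/2)^{M-i}$ in $\Theta(i)$, which is why only the weight $2^i$ (the number of nodes at level $i$) survives in the closed form.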

As a final note, let us comment on some similarities and differences between these
concepts and results and those for standard temporal systems. First, for standard
systems observability implies reconstructibility and the two concepts are equivalent
if the state transition matrix is invertible. In our case, observability certainly implies
reconstructibility, but the former remains a much stronger condition even if $\phi$ is
invertible. In this case reconstructibility is equivalent to being able to determine the
average values of the components of the initial state [6]. Note that in contrast our
reachability concept going up the tree is actually rather weak since we have many
control inputs in the subtree to achieve a single final state $x(t_0)$. As one might
expect there is a dual theory for systems defined moving down the tree, but the tree
asymmetry leads to some important differences. In particular, weak and strong
concepts are interchanged. For example, observability is concerned with determining the
single initial state given observations in the subtree under $t_0$, while reconstructibility
corresponds to determining the entire vector $X_{M,t_0}$. In this case if $\phi$ is invertible
observability is equivalent to determining the average value of $X_{M,t_0}$. Similarly,
reachability is concerned with reaching arbitrary values for the entire vector $X_{M,t_0}$,
an extremely strong condition. A natural and much weaker condition is achieving an
arbitrary average value for $X_{M,t_0}$. A complete picture of this system theory will be
given in [6].

5 Bounds on the Error Covariance of the Filter

In the following sections we will analyze the stability of our upward Kalman filter via
Lyapunov methods. As we will see, our analysis of the ML filter will require bounds
on $P_{ML}(m|m)$, and it will also be necessary to have bounds on $P(m|m)$ in order to
infer stability of the optimal filter. Thus, in this section we begin by deriving strict
upper and lower bounds for the optimal filter error covariance $P(m|m)$. We then
use analogous arguments to derive upper and lower bounds for the ML filter error
covariance $P_{ML}(m|m)$. Existence of these bounds depends on conditions that can be
expressed in terms of the notions of upward reachability and upward reconstructibility
developed in the previous section.

Recall our system whose dynamics are described by eq.(4.1) and whose
measurements are described by eq.(4.14). We define the stochastic reachability grammian for
this system as follows.

Definition 5.1 Stochastic Reachability Grammian

$$\bar{\mathcal{R}}(t_0, M) \triangleq \sum_{i=0}^{M-1} 2^{-i-1}\, \phi(m(t_0), m(t_0)+i)\, G(m(t_0)+i+1)\, Q(m(t_0)+i+1)\, G^T(m(t_0)+i+1)\, \phi^T(m(t_0), m(t_0)+i) \tag{5.1}$$

We define the stochastic reconstructibility grammian for this system as follows.

Definition 5.2 Stochastic Reconstructibility Grammian

$$\bar{\mathcal{O}}(t_0, M) \triangleq \sum_{i=0}^{M} 2^{i}\, \phi^T(m(t_0)+i,\ m(t_0)+M)\, C^T(m(t_0)+i)\, R^{-1}(m(t_0)+i)\, C(m(t_0)+i)\, \phi(m(t_0)+i,\ m(t_0)+M) \tag{5.2}$$

Among the assumptions that we make under which we prove our bounds is that
the matrices $F(m)$, $F^{-1}(m)$, $G(m)$, $Q(m)$, $C(m)$, $R(m)$, and $R^{-1}(m)$ are bounded
functions of $m$. In terms of our reachability and reconstructibility grammians these
assumptions mean that for any $M_0 > 0$ we can find $\alpha, \beta > 0$ so that

$$\bar{\mathcal{R}}(t, M_0) \le \alpha I \quad \text{for all } t \tag{5.3}$$

$$\bar{\mathcal{O}}(t, M_0) \le \beta I \quad \text{for all } t \tag{5.4}$$

We define the notion of uniform reachability as follows.

Definition 5.3 An upward system is uniformly reachable if there exist $\gamma, M_0 > 0$
so that

$$\bar{\mathcal{R}}(t, M_0) \ge \gamma I \quad \text{for all } t \tag{5.5}$$

This property ensures that the process noise contributes a steady stream of uncertainty
into the state. Intuitively, we would expect in this case that the error covariance
$P(m|m)$ would never become equal to zero. In fact we prove that under uniform
reachability $P(m|m)$ is lower bounded by a positive definite matrix.

We also need the notion of uniform reconstructibility, which is formulated as
follows.

Definition 5.4 An upward system is uniformly reconstructible if there exist
$\delta, M_0 > 0$ so that

$$\bar{\mathcal{O}}(t, M_0) \ge \delta I \quad \text{for all } t \tag{5.6}$$

where $M$ is the bottom level of a tree.

This property ensures a steady flow of information about the state of the system.
Intuitively, we would expect that under this condition the uncertainty in our
estimate remains bounded. In fact we prove that under the condition of uniform
reconstructibility the error covariance, $P(m|m)$, is upper bounded.

Without loss of generality we can take $M_0$ to be the same in eq.'s(5.3-5.6) for any
system which is uniformly reachable and reconstructible.

5.1  Upper  Bound 

We begin by deriving an upper bound for the optimal filter error covariance, $P(m|m)$.
The general idea in deriving this bound is to make a careful comparison between the
Riccati equations for our optimal filter and the Riccati equations for the standard
Kalman filter. First consider the following lemma.

Lemma 5.1 Given the Riccati equation

$$P(m|m+1) = F(m+1)P(m+1|m+1)F^T(m+1) + G(m+1)Q(m+1)G^T(m+1) \tag{5.7}$$

$$P^{-1}(m|m) = P^{-1}(m|m+1) + C^T(m)R^{-1}(m)C(m) + P^{-1}(m|m+1) - P_x^{-1}(m) \tag{5.8}$$

and the Riccati equation

$$\bar{P}(m|m+1) = F(m+1)\bar{P}(m+1|m+1)F^T(m+1) + G(m+1)Q(m+1)G^T(m+1) \tag{5.9}$$

$$\bar{P}^{-1}(m|m) = \bar{P}^{-1}(m|m+1) + C^T(m)R^{-1}(m)C(m) \tag{5.10}$$

we have that

$$\bar{P}^{-1}(m|m) \le P^{-1}(m|m) \tag{5.11}$$

Proof 

We first note that eq.(5.8) can be rewritten as

$$P^{-1}(m|m) = P^{-1}(m|m+1) + C^T(m)R^{-1}(m)C(m) + D^T(m)D(m) \tag{5.12}$$

where $D^T(m)D(m)$ is positive semi-definite. This follows from the fact that
$P(m|m+1) \le P_x(m)$ or $P^{-1}(m|m+1) - P_x^{-1}(m) \ge 0$. The Riccati equation,
eq.'s(5.9,5.10), characterizes the error covariance for the optimal filter corresponding
to the following filtering problem.

$$x(m) = F(m+1)x(m+1) + G(m+1)w(m+1) \tag{5.13}$$

$$E[w(m)w^T(m)] = Q(m) \tag{5.14}$$

$$y(m) = C(m)x(m) + v(m) \tag{5.15}$$

$$E[v(m)v^T(m)] = R(m) \tag{5.16}$$

Similarly, the Riccati equation, eq.'s(5.7,5.12), characterizes the error covariance for
the optimal filter corresponding to the filtering problem involving the same state
equation, eq.(5.13,5.14), but with the following measurement equation.

$$y(m) = \begin{bmatrix} C(m) \\ D(m) \end{bmatrix} x(m) + u(m) \tag{5.17}$$

$$E[u(m)u^T(m)] = \begin{bmatrix} R(m) & 0 \\ 0 & I \end{bmatrix} \tag{5.18}$$

Since the filter corresponding to eq.(5.7,5.12) uses additional measurements compared
to the filter corresponding to eq.(5.9,5.10), its error covariance can be no worse than
the error covariance of the filter using fewer measurements; i.e. $P(m|m) \le \bar{P}(m|m)$
or $\bar{P}^{-1}(m|m) \le P^{-1}(m|m)$.

□ 

We now state and prove the following theorem concerning an upper bound for $P(m|m)$.

Theorem 5.1 Given uniform upper boundedness of the stochastic reconstructibility
grammian, i.e. eq.(5.4), and given uniform reconstructibility of the system there exists
$\kappa > 0$ such that for all $m$ at least $M_0$ levels from the initial level $P(m|m) \le \kappa I$.


Proof 

Consider the following set of standard Riccati equations.

$$\bar{P}(m|m+1) = F(m+1)\bar{P}(m+1|m+1)F^T(m+1) + G(m+1)Q(m+1)G^T(m+1) \tag{5.19}$$

$$\bar{P}^{-1}(m|m) = \bar{P}^{-1}(m|m+1) + C^T(m)R^{-1}(m)C(m) \tag{5.20}$$

From standard Kalman filtering results we know that given $(F(m), R^{-1/2}(m)C(m))$ is
a uniformly observable pair that is bounded above, there exists a $\kappa > 0$ such that
$\bar{P}(m|m) \le \kappa I$ or $\bar{P}^{-1}(m|m) \ge \kappa^{-1} I$. But by Corollary 4.2, $(F(m), R^{-1/2}(m)C(m))$
being a uniformly observable pair is equivalent to the original system being uniformly
reconstructible. Also, the grammian of $(F(m), R^{-1/2}(m)C(m))$ being bounded above is
equivalent to our assumption of uniform upper boundedness of the stochastic
reconstructibility grammian. Thus, under uniform reconstructibility and the uniform upper
boundedness of the stochastic reconstructibility grammian of the original system we deduce
that $\bar{P}^{-1}(m|m) \ge \kappa^{-1} I$. But from Lemma 5.1 we know that $\bar{P}^{-1}(m|m) \le P^{-1}(m|m)$.
Thus, $P^{-1}(m|m) \ge \kappa^{-1} I$ or $P(m|m) \le \kappa I$.

□ 
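
The comparison used in Lemma 5.1 and Theorem 5.1 can be illustrated with a scalar Python sketch that runs the two recursions of eq.'s(5.7,5.8) and eq.'s(5.9,5.10) side by side. Everything specific here is an assumption made for illustration: the function name `compare_riccati`, the constant parameters, and in particular the choice $F = a$, $GQG^T = b^2$, $P_x = b^2/(1-a^2)$, which is picked only because it keeps the condition $P(m|m+1) \le P_x(m)$ used in the proof satisfied at every level.

```python
def compare_riccati(a, b, c, r, levels):
    """Scalar sketch: list of (P, P_bar) pairs from the finest level upward."""
    px = b * b / (1.0 - a * a)            # constant prior variance (assumption)
    f, gqg = a, b * b                     # constant F and G Q G^T (assumption)
    info = c * c / r                      # C^T R^{-1} C
    p = p_bar = 1.0 / (1.0 / px + info)   # same starting value for both recursions
    history = [(p, p_bar)]
    for _ in range(levels):
        p_pred = f * f * p + gqg                        # eq.(5.7)
        p = 1.0 / (2.0 / p_pred + info - 1.0 / px)      # eq.(5.8)
        pb_pred = f * f * p_bar + gqg                   # eq.(5.9)
        p_bar = 1.0 / (1.0 / pb_pred + info)            # eq.(5.10)
        history.append((p, p_bar))
    return history


history = compare_riccati(a=0.9, b=1.0, c=1.0, r=1.0, levels=20)
dominated = all(p <= p_bar + 1e-12 for p, p_bar in history)
```

In this sketch the standard recursion stays above the multiscale one at every level, so any upper bound on the standard Riccati solution carries over, which is the mechanism the proof relies on.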

We can easily apply the previous ideas to derive an upper bound for $P_{ML}(m|m)$.
Note that Lemma 5.1 would still apply if eq.(5.8) did not have the $P_x^{-1}(m)$ term; i.e.
the lemma would apply to the case of the ML Riccati equations. Then by using the
same argument used to prove Theorem 5.1 we can show the following theorem.

Theorem 5.2 Given uniform upper boundedness of the stochastic reconstructibility
grammian, i.e. eq.(5.4), and given uniform reconstructibility of the system there exists
$\kappa' > 0$ such that for all $m$ at least $M_0$ levels from the initial level $P_{ML}(m|m) \le \kappa' I$.

5.2  Lower  Bound 

We now derive a lower bound for $P(m|m)$. As in deriving the upper bound, we appeal
heavily to standard system theory.

Lemma 5.2 Let

$$S(m|m) \triangleq \tfrac{1}{2}\big(P^{-1}(m|m) - C^T(m)R^{-1}(m)C(m) + P_x^{-1}(m)\big) \tag{5.21}$$

$$S(m|m+1) \triangleq F^{-T}(m+1)P^{-1}(m+1|m+1)F^{-1}(m+1) \tag{5.22}$$

Given the Riccati equation

$$S^*(m|m+1) = 2F^{-T}(m+1)S^*(m+1|m+1)F^{-1}(m+1) + F^{-T}(m+1)C^T(m)R^{-1}(m)C(m)F^{-1}(m+1) \tag{5.23}$$

$$S^{*-1}(m|m) = S^{*-1}(m|m+1) + G(m+1)Q(m+1)G^T(m+1) \tag{5.24}$$

where $P(0|0) = S^*(0|0)$. Then for all $m$, $S^*(m|m) \ge S(m|m)$.

Proof

By substituting eq.(5.12) into eq.(5.21) and collecting terms we get

$$S(m|m) = P^{-1}(m|m+1) \tag{5.25}$$

By substituting eq.(3.1) into eq.(5.25) we arrive at

$$S(m|m) = [F(m+1)P(m+1|m+1)F^T(m+1) + G(m+1)Q(m+1)G^T(m+1)]^{-1}$$

$$\phantom{S(m|m)} = [S^{-1}(m|m+1) + G(m+1)Q(m+1)G^T(m+1)]^{-1} \tag{5.26}$$

where the last equality results from the substitution of eq.(5.22). Also, by
substituting eq.(5.21) into eq.(5.22) and collecting terms we get

$$S(m|m+1) = 2F^{-T}(m+1)S(m+1|m+1)F^{-1}(m+1) + F^{-T}(m+1)C^T(m)R^{-1}(m)C(m)F^{-1}(m+1) - F^{-T}(m+1)P_x^{-1}(m)F^{-1}(m+1) \tag{5.27}$$

Now we prove by induction that for all $m$, $S^*(m|m) \ge S(m|m)$. Obviously, $S^*(0|0) \ge S(0|0)$.
As an induction hypothesis we assume $S^*(i+1|i+1) \ge S(i+1|i+1)$. From
eq.(5.27), eq.(5.23), and the fact that $F^{-T}(m+1)P_x^{-1}(m)F^{-1}(m+1) \ge 0$ we get that

$$S^{*-1}(i|i+1) \le S^{-1}(i|i+1) \tag{5.28}$$

Substituting eq.(5.24) and eq.(5.26) into eq.(5.28) and cancelling terms we arrive at
$S^{*-1}(i|i) \le S^{-1}(i|i)$, i.e. $S^*(i|i) \ge S(i|i)$.

□ 

Theorem 5.3 Given uniform upper boundedness of the stochastic reachability
grammian, i.e. eq.(5.3), and given uniform reachability of the system there exists $L > 0$
such that for all $m$ at least $M_0$ levels from the initial level $P(m|m) \ge LI$.


Proof

Consider the following set of standard Riccati equations.

$$S^*(m|m+1) = 2F^{-T}(m+1)S^*(m+1|m+1)F^{-1}(m+1) + F^{-T}(m+1)C^T(m)R^{-1}(m)C(m)F^{-1}(m+1) \tag{5.29}$$

$$S^{*-1}(m|m) = S^{*-1}(m|m+1) + G(m+1)Q(m+1)G^T(m+1) \tag{5.30}$$

From standard Kalman filtering results we know that if $(F^{-1}(m), G(m)Q^{1/2}(m))$ is
a uniformly reachable pair that is bounded above, then there exists $N > 0$ such
that $S^*(m|m) \le NI$. However, from Corollary 4.1 and the invertibility of $F(m)$
the uniform reachability of the pair $(F^{-1}(m), G(m)Q^{1/2}(m))$ is equivalent to the
original system being uniformly reachable. Also, the grammian of $(F^{-1}(m), G(m)Q^{1/2}(m))$
being bounded above is equivalent to our assumption of uniform upper boundedness
of the stochastic reachability grammian. Thus, under uniform reachability
and the uniform upper boundedness of the stochastic reachability grammian of
the original system we deduce that $S^*(m|m) \le NI$. But from Lemma 5.2 we know that
$S^*(m|m) \ge S(m|m)$. Thus, $S(m|m) \le NI$. But from eq.(5.21) we get

$$\tfrac{1}{2}\big(P^{-1}(m|m) - C^T(m)R^{-1}(m)C(m) + P_x^{-1}(m)\big) \le NI \tag{5.31}$$

It follows straightforwardly that

$$P^{-1}(m|m) \le L^{-1}I \tag{5.32}$$

where

$$L^{-1}I \ge 2NI + C^T(m)R^{-1}(m)C(m) \tag{5.33}$$

Thus,

$$P(m|m) \ge LI \tag{5.34}$$

□ 

Using analogous arguments we can derive a lower bound for $P_{ML}(m|m)$. Note
that with the following definitions $S^*$ obeys equations (5.23,5.24).

$$S^*(m|m) \triangleq \tfrac{1}{2}\big(P_{ML}^{-1}(m|m) - C^T(m)R^{-1}(m)C(m)\big) \tag{5.35}$$

$$S^*(m|m+1) \triangleq F^{-T}(m+1)P_{ML}^{-1}(m+1|m+1)F^{-1}(m+1) \tag{5.36}$$

Using the same argument as in the proof of Theorem 5.3 with our current definitions
for $S^*$ we get that

$$\tfrac{1}{2}\big(P_{ML}^{-1}(m|m) - C^T(m)R^{-1}(m)C(m)\big) \le NI \tag{5.37}$$

for $N > 0$. Equivalently,

$$P_{ML}^{-1}(m|m) \le (L')^{-1}I \tag{5.38}$$

for

$$(L')^{-1}I \ge 2NI + C^T(m)R^{-1}(m)C(m) \tag{5.39}$$

Thus, we have the following theorem.

Theorem 5.4 Given uniform upper boundedness of the stochastic reachability
grammian, i.e. eq.(5.3), and given uniform reachability of the system there exists $L' > 0$
such that for all $m$ $P_{ML}(m|m) \ge L'I$.

6 Upward Stability on Trees

In this section we formalize the notion of stability for dynamic systems evolving up
the tree. The dynamics on which we are interested in focusing the major portion of
our analysis are the ML error dynamics of eq.(3.28). Thus the general class of systems
we wish to study here has the form

$$z(t) = A(m(t)+1)\,[z(\alpha t) + z(\beta t)] + G(m(t))\,u(t) \tag{6.1}$$

What we wish to do is to study the asymptotic stability of this system as the dynamics
propagate up the tree. Since we are interested in internal stability, we will consider
the autonomous system with $u = 0$.

Intuitively what we would like stability to mean is that $z(t) \to 0$ as we propagate
farther and farther away from the initial level of the tree. Note, however, that as we
move up the tree (or equivalently as the initial level moves farther down), $z(t)$ is
influenced by a geometrically increasing number of nodes at the initial level. For example,
$z(t)$ depends on $\{z(\alpha t), z(\beta t)\}$ or, alternatively, on
$\{z(\alpha^2 t), z(\beta\alpha t), z(\alpha\beta t), z(\beta^2 t)\}$ or, alternatively, on
$\{z(\alpha^3 t), z(\beta\alpha^2 t), z(\alpha\beta\alpha t), z(\beta^2\alpha t), z(\alpha^2\beta t), z(\beta\alpha\beta t), z(\alpha\beta^2 t), z(\beta^3 t)\}$,
etc. Thus in order to study asymptotic stability it is necessary to consider an
infinite dyadic tree, with an infinite set of initial conditions corresponding to all nodes
at the initial level. Note also that we might expect that there would be a number
of meanings we could give to "$z(t) \to 0$", e.g. do we consider individual nodes at a
level or the infinite sequence of values at all points at a level?

To formalize the notion of stability let us change the sense of our index of recursion
so that $m$ increases as we move up the tree. Specifically, we arbitrarily choose a level
of the tree to be our "initial" level, i.e. level 0, and we index the points on this initial
level as $z_i(0)$ for $i \in \mathbb{Z}$. Points at the $m$th level up from level 0 are denoted $z_i(m)$ for
$i \in \mathbb{Z}$. The dynamical equations we then wish to consider are of the form

$$z_i(m) = A(m-1)\big(z_{2i}(m-1) + z_{2i+1}(m-1)\big) \tag{6.2}$$

Let $Z(m)$ denote the infinite sequence at level $m$, i.e. the set $\{z_i(m),\ i \in \mathbb{Z}\}$.
The p-norm on such a sequence is defined as

$$\|Z(m)\|_p \triangleq \Big(\sum_i \|z_i(m)\|_p^p\Big)^{1/p} \tag{6.3}$$

where $\|z_i(m)\|_p$ is the standard p-norm for the finite dimensional vector $z_i(m)$.

We define the following notion of exponential stability for a system.

Definition 6.1 A system is $l_p$-exponentially stable if, given any initial sequence $Z(0)$
such that $\|Z(0)\|_p < \infty$,

$$\|Z(m)\|_p \le C\alpha^m \|Z(0)\|_p \tag{6.4}$$

where $0 \le \alpha < 1$ and $C$ is a positive constant.


From eq.(6.2) we can easily write down the following.

$$z_i(m) = \Phi(m, 0) \sum_{j \in O_{m,i}} z_j(0) \tag{6.5}$$

where the cardinality of the set $O_{m,i}$ is $2^m$ and for $m_1 \ge m_2$

$$\Phi(m_1, m_2) = \begin{cases} I & m_1 = m_2 \\ A(m_1 - 1)\,\Phi(m_1 - 1, m_2) & m_1 > m_2 \end{cases} \tag{6.6}$$

As in the case of standard dynamic systems it is the state transition matrix, $\Phi(m, 0)$,
which plays a crucial role in studying stability on trees. However, unlike the standard
case, as one can see from eq.(6.5), the nature of the initial condition that influences
$z_i(m)$ depends crucially on $m$; in particular the number of points at level 0 to be
summed up and scaled to give $z_i(m)$ is $2^m$. These observations lead to the following:

Theorem 6.1 The system defined in eq.(6.2) is $l_p$-exponentially stable if and only if

$$2^{m/q}\, \|\Phi(m, 0)\|_p \le K' \gamma^m \quad \text{for all } m \tag{6.7}$$

where $0 \le \gamma < 1$, $K'$ is a positive constant, and $q$ is the conjugate exponent of $p$,
i.e. $\frac{1}{p} + \frac{1}{q} = 1$.


Proof

Let us first show necessity. Specifically, suppose that for any $K' > 0$, $0 \le \gamma < 1$,
and $M \ge 0$ we can find a vector $z$ and an $m > M$ so that

$$\|\Phi(m, 0)z\|_p > K' \gamma^m 2^{-m/q} \|z\|_p \tag{6.8}$$

where

$$\frac{1}{p} + \frac{1}{q} = 1 \tag{6.9}$$

Let $z$ and $m$ be such a vector and integer for some choice of $K'$, $\gamma$, and $M$, and define
an initial sequence as follows. Let $\rho_0, \rho_1, \rho_2, \ldots$ be a sequence with

$$\sum_{i=0}^{\infty} \rho_i^p = 1 \tag{6.10}$$

Then let

$$z_i(0) = \begin{cases} \rho_0 z & 0 \le i < 2^m \\ \rho_1 z & 2^m \le i < 2 \cdot 2^m \\ \ \vdots & \\ \rho_j z & j2^m \le i < (j+1)2^m \\ \ \vdots & \end{cases} \tag{6.11}$$

Note that

$$\|Z(0)\|_p^p = \sum_{i=0}^{\infty} \|z_i(0)\|_p^p = 2^m \|z\|_p^p \tag{6.12}$$

Also, note that

$$z_i(m) = \Phi(m, 0) \sum_{j=i2^m}^{(i+1)2^m - 1} z_j(0) = 2^m \rho_i\, \Phi(m, 0) z \tag{6.13}$$

Thus,

$$\|Z(m)\|_p^p = 2^{mp}\, \|\Phi(m, 0)z\|_p^p$$

$$\phantom{\|Z(m)\|_p^p} > 2^{mp} K'^p \gamma^{mp} 2^{-mp/q} \|z\|_p^p$$

$$\phantom{\|Z(m)\|_p^p} = 2^{mp} K'^p \gamma^{mp} 2^{-mp/q} 2^{-m} \|Z(0)\|_p^p$$

$$\phantom{\|Z(m)\|_p^p} = K'^p \gamma^{mp} \|Z(0)\|_p^p \tag{6.14}$$

where the first equality comes from eq.(6.10), the inequality from eq.(6.8), the next
equality from eq.(6.12), and the last equality from eq.(6.9). Hence for any $K'$,
$0 \le \gamma < 1$ and $M \ge 0$ we can find an initial $l_p$-sequence $Z(0)$ and an $m > M$ so that

$$\|Z(m)\|_p > K' \gamma^m \|Z(0)\|_p \tag{6.15}$$

so that the system cannot be $l_p$-exponentially stable.

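To prove sufficiency we use the following.

Lemma 6.1 A system is $l_p$-exponentially stable if for every $i$

$$\|z_i(m)\|_p \le K \gamma^m \Big(\sum_{j \in O_{m,i}} \|z_j(0)\|_p^p\Big)^{1/p} \tag{6.16}$$

where $0 \le \gamma < 1$ and $K$ is a positive constant.

Proof

By raising both sides of eq.(6.16) to the $p$th power we get

$$\|z_i(m)\|_p^p \le K^p \gamma^{mp} \sum_{j \in O_{m,i}} \|z_j(0)\|_p^p \tag{6.17}$$

Since eq.(6.17) holds for every $i$ we can write

$$\sum_i \|z_i(m)\|_p^p \le K^p \gamma^{mp} \sum_i \|z_i(0)\|_p^p \tag{6.18}$$

The lemma follows from raising both sides of eq.(6.18) to the power of $\frac{1}{p}$.
□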

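Lemma 6.2 Consider the sequence of vectors $x_i$ for $i \in \mathbb{Z}$. Then, for any $m$ and
any $j$

$$\Big\|\sum_{i \in O_{m,j}} x_i\Big\|_p \le 2^{m/q} \Big(\sum_{i \in O_{m,j}} \|x_i\|_p^p\Big)^{1/p} \tag{6.19}$$

where $O_{m,j} = \{j, j+1, \ldots, j + 2^m - 1\}$ and $q$ satisfies eq.(6.9).

Proof

We first show the following.

$$\|a + b\|_p \le 2^{1/q} \big(\|a\|_p^p + \|b\|_p^p\big)^{1/p} \tag{6.20}$$

Since $\|\cdot\|_p^p$ is a convex function, we can write

$$\big\|\tfrac{1}{2}a + (1 - \tfrac{1}{2})b\big\|_p^p \le \tfrac{1}{2}\|a\|_p^p + (1 - \tfrac{1}{2})\|b\|_p^p \tag{6.21}$$

from which eq.(6.20) follows immediately. We now show the result by induction on
$m$. Suppose for all $j$

$$\Big\|\sum_{i \in O_{m,j}} x_i\Big\|_p \le 2^{m/q} \Big(\sum_{i \in O_{m,j}} \|x_i\|_p^p\Big)^{1/p} \tag{6.22}$$

Consider summing $x_i$ over the two sets $O_{m,j_1}$ and $O_{m,j_2}$ where $j_2 = j_1 + 2^m$. From
eq.(6.20) we get

$$\Big\|\sum_{i \in O_{m,j_1}} x_i + \sum_{i \in O_{m,j_2}} x_i\Big\|_p \le 2^{1/q} \Big(\Big\|\sum_{i \in O_{m,j_1}} x_i\Big\|_p^p + \Big\|\sum_{i \in O_{m,j_2}} x_i\Big\|_p^p\Big)^{1/p} \tag{6.23}$$

Then by substituting eq.(6.22) into eq.(6.23) we get

$$\Big\|\sum_{i \in O_{m+1,j_1}} x_i\Big\|_p \le 2^{(m+1)/q} \Big(\sum_{i \in O_{m,j_1}} \|x_i\|_p^p + \sum_{i \in O_{m,j_2}} \|x_i\|_p^p\Big)^{1/p} \tag{6.24}$$

□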
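
The inequality of eq.(6.19) is easy to check numerically. The following Python sketch evaluates both sides for a block of $2^m$ vectors; the function names and the test data are illustrative assumptions, and $p > 1$ is assumed so that the conjugate exponent $q$ is finite.

```python
import random


def p_norm(v, p):
    """Standard p-norm of a finite-dimensional vector."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def lemma_6_2_sides(vectors, p):
    """Return (lhs, rhs) of eq.(6.19) for 2^m vectors of equal dimension."""
    count = len(vectors)
    m = count.bit_length() - 1
    assert count == 2 ** m, "need exactly 2^m vectors"
    q = p / (p - 1.0)                              # conjugate exponent, eq.(6.9)
    total = [sum(col) for col in zip(*vectors)]    # componentwise sum of the block
    lhs = p_norm(total, p)
    rhs = 2 ** (m / q) * sum(p_norm(v, p) ** p for v in vectors) ** (1.0 / p)
    return lhs, rhs


random.seed(0)
block = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(8)]
sides = {p: lemma_6_2_sides(block, p) for p in (1.5, 2.0, 3.0)}
```

The bound is attained when all vectors in the block are identical, which is the same worst case exploited by the initial sequence in the necessity part of the proof.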

We can now show sufficiency thereby completing the proof of the theorem. By
applying the p-norm to eq.(6.5) and using the Cauchy-Schwarz inequality we get

$$\|z_i(m)\|_p \le \|\Phi(m, 0)\|_p\, \Big\|\sum_{j \in O_{m,i}} z_j(0)\Big\|_p \tag{6.25}$$

Using Lemma 6.2, we get

$$\|z_i(m)\|_p \le \|\Phi(m, 0)\|_p\, 2^{m/q} \Big(\sum_{j \in O_{m,i}} \|z_j(0)\|_p^p\Big)^{1/p} \tag{6.26}$$

By substituting eq.(6.7) into eq.(6.26) we get

$$\|z_i(m)\|_p \le K' \gamma^m \Big(\sum_{j \in O_{m,i}} \|z_j(0)\|_p^p\Big)^{1/p} \tag{6.27}$$

which by Lemma 6.1 shows the system to be $l_p$-exponentially stable.

□

Note that, referring to eq.'s(6.2,6.5,6.6), we see that the $l_p$-exponential stability of
eq.(6.2) is equivalent to the usual exponential stability of the system

$$\xi(m) = 2^{1/q} A(m-1)\,\xi(m-1) \tag{6.28}$$

For example for $p = 2$, we are interested in the exponential stability of

$$\xi(m) = \sqrt{2}\, A(m-1)\,\xi(m-1) \tag{6.29}$$

If $A$ is constant this is equivalent to requiring $A$ to have eigenvalues with magnitudes
$< \frac{\sqrt{2}}{2}$.

Note also that it is straightforward to show that if one considers the system with
inputs and outputs

$$z_i(m) = A(m-1)\big(z_{2i}(m-1) + z_{2i+1}(m-1)\big) + B(m-1)\big(u_{2i}(m-1) + u_{2i+1}(m-1)\big) \tag{6.30}$$

$$y_i(m) = C(m)\,z_i(m) \tag{6.31}$$

then if $B(m)$ and $C(m)$ are bounded, the asymptotic stability of the undriven dynamics
implies bounded-input/bounded-output stability.
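
The threshold $\frac{\sqrt{2}}{2}$ for $p = 2$ can be seen in a small simulation of eq.(6.2) with a scalar, constant $A = a$. The Python sketch below uses a finitely supported initial sequence (all other initial values are taken to be zero); the function names and the values of $a$ are assumptions chosen for illustration.

```python
import math


def step_up(level, a):
    """One application of eq.(6.2) with scalar, constant A = a."""
    return [a * (level[2 * i] + level[2 * i + 1]) for i in range(len(level) // 2)]


def l2_norms(z0, a):
    """l2 norm of the sequence at each level, starting from level 0."""
    norms = [math.sqrt(sum(x * x for x in z0))]
    level = z0
    while len(level) > 1:
        level = step_up(level, a)
        norms.append(math.sqrt(sum(x * x for x in level)))
    return norms


decaying = l2_norms([1.0] * 16, a=0.6)   # sqrt(2) * 0.6 is below 1
growing = l2_norms([1.0] * 16, a=0.8)    # sqrt(2) * 0.8 is above 1
```

For a constant initial sequence the norm changes by exactly the factor $\sqrt{2}\,|a|$ per level, so the sequence norm can grow even though $|a| < 1$, which is the effect the factor $2^{1/q}$ in eq.(6.28) accounts for.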

7  Filter  Stability 

In  this  section  we  show  that  the  error  dynamics  of  the  maximum  likelihood  filter  are 
stable  and  also  that  the  same  is  true  of  the  overall  filter. 

Theorem 7.1 Suppose that the system is uniformly reachable and uniformly
reconstructible. Then, the error dynamics of the maximum likelihood filter are
$l_2$-exponentially stable.


Proof

The following proof follows closely the standard proof for stability of discrete-time
Kalman filters given in [9]. Based on the comments at the end of the preceding section
and on the ML error dynamics of eq.(3.28), we see that we wish to show that the
following causal system is stable in the standard sense.

$$z(m) = P_{ML}(m|m)\,P_{ML}^{-1}(m|m-1)\,\sqrt{2}\,F(m-1)\,z(m-1) \tag{7.1}$$

Theorem's 5.2 and 5.4, i.e. the upper and lower bounds on $P_{ML}(m|m)$, allow us to
define the following Lyapunov function.

$$V(z, m) \triangleq z^T(m)\,P_{ML}^{-1}(m|m)\,z(m) \tag{7.2}$$

Let us also define the following quantity.

$$\hat{z}(m) \triangleq \sqrt{2}\,F(m-1)\,z(m-1) \tag{7.3}$$

$$\phantom{\hat{z}(m)} = P_{ML}(m|m-1)\,P_{ML}^{-1}(m|m)\,z(m) \tag{7.4}$$

Substituting eq.(3.7) into eq.(7.2) followed by algebraic manipulations, one gets

$$V(z, m) = z^T(m)\big(2P_{ML}^{-1}(m|m-1) + C^T(m)R^{-1}(m)C(m)\big)z(m) \tag{7.5}$$

$$= 2z^T(m)\big(P_{ML}^{-1}(m|m) - 2P_{ML}^{-1}(m|m-1)\big)z(m) - z^T(m)C^T(m)R^{-1}(m)C(m)z(m) + z^T(m)\big(2P_{ML}^{-1}(m|m-1)\big)z(m)$$

$$= 2z^T(m)P_{ML}^{-1}(m|m-1)\hat{z}(m) - 2z^T(m)P_{ML}^{-1}(m|m-1)z(m) - z^T(m)C^T(m)R^{-1}(m)C(m)z(m) \tag{7.6}$$

$$= -\Big(\sqrt{2}z(m) - \frac{\hat{z}(m)}{\sqrt{2}}\Big)^T P_{ML}^{-1}(m|m-1)\Big(\sqrt{2}z(m) - \frac{\hat{z}(m)}{\sqrt{2}}\Big) - z^T(m)C^T(m)R^{-1}(m)C(m)z(m) + \frac{\hat{z}^T(m)}{\sqrt{2}}P_{ML}^{-1}(m|m-1)\frac{\hat{z}(m)}{\sqrt{2}} \tag{7.7}$$

But note that by using the matrix inversion lemma we get

$$\frac{\hat{z}^T(m)}{\sqrt{2}}\,P_{ML}^{-1}(m|m-1)\,\frac{\hat{z}(m)}{\sqrt{2}} = V(z, m-1) - \Delta \tag{7.8}$$

$$\Delta \ge 0 \tag{7.9}$$

It follows that

$$V(z, m) - V(z, m-1) \le -\Big(\sqrt{2}z(m) - \frac{\hat{z}(m)}{\sqrt{2}}\Big)^T P_{ML}^{-1}(m|m-1)\Big(\sqrt{2}z(m) - \frac{\hat{z}(m)}{\sqrt{2}}\Big) - z^T(m)C^T(m)R^{-1}(m)C(m)z(m) \tag{7.10}$$

Stability follows from eq.(7.10) under the condition of uniform observability of the
pair $(F(m), R^{-1/2}(m)C(m))$, which by Corollary 4.2 is equivalent to uniform
reconstructibility of the system.

□ 
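The Lyapunov argument above can be checked numerically. The following is a minimal sketch, assuming constant matrices with illustrative values (F, G, Q, C, R and the initial conditions are hypothetical, not taken from the paper) and a finite positive definite covariance as a stand-in for the ML initial condition. It propagates the covariance recursion together with the causal system of eq.(7.1) and records $V(z,m)$.

```python
import numpy as np

# Hypothetical constant system matrices (illustrative values only).
F = np.array([[0.9, 0.2], [0.0, 0.7]])
G = np.eye(2)
Q = 0.5 * np.eye(2)
C = np.array([[1.0, 0.0]])
R = np.array([[1.0]])


def lyapunov_trace(P0, z0, steps=30):
    """Return V(z, m) = z^T P^{-1}(m|m) z along the causal system of eq.(7.1)."""
    S = C.T @ np.linalg.inv(R) @ C
    P, z = P0.copy(), z0.copy()
    V = [float(z @ np.linalg.inv(P) @ z)]
    for _ in range(steps):
        # Fine-to-coarse prediction of the covariance.
        P_pred = F @ P @ F.T + G @ Q @ G.T
        # Fusion of the two subtrees plus measurement update (information form).
        P_new = np.linalg.inv(2.0 * np.linalg.inv(P_pred) + S)
        # Causal system: z(m) = P(m|m) P^{-1}(m|m-1) sqrt(2) F z(m-1).
        z = P_new @ np.linalg.inv(P_pred) @ (np.sqrt(2.0) * F @ z)
        P = P_new
        V.append(float(z @ np.linalg.inv(P) @ z))
    return np.array(V)


# Finite stand-in for the ML initial condition (an assumption of this sketch).
V = lyapunov_trace(10.0 * np.eye(2), np.array([1.0, -2.0]))
print("V nonincreasing:", bool(np.all(np.diff(V) <= 1e-12)))
```

For this example the recorded values never increase from one level to the next, which is the behavior eq.(7.10) predicts.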

Let us now examine the full estimation error after incorporating prior statistics. It is straightforward to see that

$$\tilde{x}(t|t) = P(m(t)|m(t))\left(P_{ML}^{-1}(m(t)|m(t))\,\tilde{x}_{ML}(t|t) + P_x^{-1}(m(t))\,x(t)\right)$$ (7.11)

Thus we can view $\tilde{x}(t|t)$ as a linear combination of the states of two upward-evolving systems, eq.(3.28) for $\tilde{x}_{ML}(t|t)$ and one for $P_x^{-1}(m(t))x(t)$. Note first that since $P(m|m) \le P_{ML}(m|m)$,

$$\|P(m(t)|m(t))\,P_{ML}^{-1}(m(t)|m(t))\,\tilde{x}_{ML}(t|t)\| \le \|\tilde{x}_{ML}(t|t)\|$$ (7.12)

and we already have the stability of the $\tilde{x}_{ML}(t|t)$ dynamics from Theorem 7.1. Turning to the second term in eq.(7.11), note first that thanks to Theorem 5.1, $P(m(t)|m(t))$ is bounded. Note also that the covariance of $P_x^{-1}(m(t))x(t)$ is simply $P_x^{-1}(m(t))$. By uniform reachability $P_x^{-1}(m(t))$ is bounded above. Thus, while $P_x(m(t))$ might diverge, the contribution to the error of the second term in eq.(7.11) is bounded.

Also, our previous analysis allows us to conclude that the full, driven $\tilde{x}_{ML}(t|t)$ dynamics are bounded-input, bounded-output stable from inputs $w$ and $v$ to output $\tilde{x}_{ML}(t|t)$. If we use eq.(3.19), together with eq.(2.3) and eq.'s(2.6-2.8), we can write down the following upward dynamics for $\zeta(t) = P_x^{-1}(m(t))x(t)$:

$$\zeta(t) = \frac{1}{2}A^T(m(t)+1)\left(\zeta(\alpha t) + \zeta(\beta t)\right) + \frac{1}{2}N(m(t)+1)\left(\tilde{w}(\alpha t) + \tilde{w}(\beta t)\right)$$ (7.13)

where

$$N(m(t)+1) = P_x^{-1}(m(t))\,A^{-1}(m(t)+1)\,B(m(t)+1)$$ (7.14)

Note that in general there is no reason to constrain the autonomous dynamics of eq.(7.13) to be stable. However, if they are not, then reachability implies that $P_x(m) \to \infty$, so that $N(m) \to 0$ and the covariance of $\tilde{w} \to I$. The bounded-input, bounded-output stability of this system can be easily checked.
8  Steady-state  Filter

In  this  section  we  study  properties  of  our  filter  under  steady-state  conditions;  i.e.  we 
analyze  the  asymptotic  properties  of  the  filter.  We  state  and  prove  several  results. 
First we show that the error covariance of the ML estimator converges to a steady-state limit and that, furthermore, the steady-state filter is $l_2$-exponentially stable.


Theorem 8.1 Consider the following system defined on a tree.

$$x(t) = \frac{1}{2}F\left(x(\alpha t) + x(\beta t)\right) + \frac{1}{2}G\left(w(\alpha t) + w(\beta t)\right)$$ (8.1)

$$y(t) = Cx(t) + v(t)$$ (8.2)

$$E[w(t)w^T(t)] = Q$$ (8.3)

$$E[v(t)v^T(t)] = R$$ (8.4)

where $v(t)$ is white and $w(t)$ is white in subtrees. Suppose that $(F, GQ^{1/2})$ is a reachable pair and $(F, R^{-1/2}C)$ is an observable pair. Then the error covariance for the ML estimator, $P_{ML}(m|m)$, converges as $m \to \infty$ to $P_\infty$, which is the unique positive definite solution to

$$P_\infty = \frac{1}{2}FP_\infty F^T + \frac{1}{2}GQG^T - K_\infty\left(\frac{1}{2}CFP_\infty F^TC^T + \frac{1}{2}CGQG^TC^T + R\right)K_\infty^T$$ (8.5)

where

$$K_\infty = P_\infty C^TR^{-1}$$ (8.6)

Moreover, the autonomous dynamics of the steady-state ML filter, i.e.

$$e(t) = \frac{1}{2}(I - K_\infty C)F\left(e(\alpha t) + e(\beta t)\right)$$ (8.7)

are $l_2$-exponentially stable.

Proof 
Recall the Riccati equations for the ML estimator where the scale variable $m$ increases in the direction upward along the tree.

$$P_{ML}(m|m+1) = FP_{ML}(m+1|m+1)F^T + GQG^T$$ (8.8)

$$P_{ML}^{-1}(m|m) = 2P_{ML}^{-1}(m|m+1) + C^TR^{-1}C$$ (8.9)

Convergence of $P_{ML}(m|m)$

In order to show the existence of a limit of $P_{ML}(m|m)$ as $m \to \infty$ we show both that a) $P_{ML}(m|m)$ is monotone-nonincreasing in $m$ and that b) $P_{ML}(m|m)$ is bounded below.

a) We adopt the following notation.

$$P(m) \triangleq P_{ML}(m|m), \quad m \ge 0$$ (8.10)

$$P(m;m') \triangleq P(m - m'), \quad m \ge m'$$ (8.11)

By the scale-invariance of our system, showing

$$m_1 \le m_2 \;\Rightarrow\; P(m;m_1) \le P(m;m_2)$$ (8.12)

is equivalent to demonstrating that $P(m)$ is monotone-nonincreasing.

We note that eq.'s(8.8,8.9) preserve positive definite orderings; i.e. if $P_1(m_2) \le P_2(m_2)$ then $P_1(m;m_2) \le P_2(m;m_2)$ for $m \ge m_2$. We now take

$$P_1(m_2) = P(m_2;m_1)$$ (8.13)

$$P_2(m_2) = \infty \quad \text{(initial condition for the ML estimator)}$$ (8.14)

Then,

$$P_1(m;m_2) = P(m;m_1)$$ (8.15)

$$P_2(m;m_2) = P(m;m_2)$$ (8.16)

for $m \ge m_2$. So by the property of positive definite ordering of the Riccati equations we know that

$$P_1(m;m_2) \le P_2(m;m_2)$$ (8.17)

and thus,

$$P(m;m_1) \le P(m;m_2)$$ (8.18)
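Both properties used in part a) can be illustrated numerically. The sketch below assumes constant, illustrative matrices (hypothetical values, not from the paper) and replaces the infinite ML initial condition by a large finite covariance. It checks that one pass of eq.'s(8.8,8.9) preserves the ordering of two covariances and that the sequence started from the large covariance is nonincreasing.

```python
import numpy as np

# Hypothetical constant system matrices (illustrative values only).
F = np.array([[0.9, 0.2], [0.0, 0.7]])
G = np.eye(2)
Q = 0.5 * np.eye(2)
C = np.array([[1.0, 0.0]])
R = np.array([[1.0]])


def riccati_step(P):
    """One prediction, fusion and update pass in the form of eq.'s(8.8,8.9)."""
    P_pred = F @ P @ F.T + G @ Q @ G.T
    info = 2.0 * np.linalg.inv(P_pred) + C.T @ np.linalg.inv(R) @ C
    return np.linalg.inv(info)


def is_psd(M, tol=1e-9):
    """True if the symmetric part of M is positive semidefinite up to tol."""
    return bool(np.linalg.eigvalsh(0.5 * (M + M.T)).min() >= -tol)


# Ordering: start from P1 <= P2 and verify the ordering survives each step.
P1 = np.eye(2)
P2 = P1 + np.array([[2.0, 0.5], [0.5, 1.0]])
ordering_preserved = True
for _ in range(20):
    P1, P2 = riccati_step(P1), riccati_step(P2)
    ordering_preserved = ordering_preserved and is_psd(P2 - P1)

# Monotonicity: a large finite covariance stands in for the infinite
# ML initial condition (an assumption of this sketch).
P = 100.0 * np.eye(2)
monotone = True
for _ in range(40):
    P_next = riccati_step(P)
    monotone = monotone and is_psd(P - P_next)
    P = P_next

print("ordering preserved:", ordering_preserved)
print("monotone nonincreasing:", monotone)
```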

b) The fact that $P_{ML}(m|m)$ is bounded below follows from Theorem 5.4 under our assumptions of reachability and observability.

Having established the convergence of $P_{ML}(m|m)$, let us denote the limit as follows.

$$\lim_{m \to \infty} P_{ML}(m|m) = P_\infty$$ (8.19)

Note that by Theorem 5.4 $P_\infty$ must be positive definite. We can also establish that $P_{ML}(m|m)$ must converge to the solution of the steady state Riccati eq.(8.5). Since $P_{ML}(m|m)$ both satisfies the Riccati eq.'s(8.8,8.9) and converges to a limit, this limit must satisfy the fixed point equation for eq.'s(8.8,8.9). This fixed point equation is precisely the steady state Riccati eq.(8.5).
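The fixed-point claim can be sketched numerically: iterate eq.'s(8.8,8.9) until the covariance stops changing and then evaluate the residual of eq.(8.5). The matrices below are illustrative assumptions, the helper names `riccati_step` and `steady_state` are hypothetical, and a large finite covariance again stands in for the ML initial condition.

```python
import numpy as np

# Hypothetical constant system matrices (illustrative values only).
F = np.array([[0.9, 0.2], [0.0, 0.7]])
G = np.eye(2)
Q = 0.5 * np.eye(2)
C = np.array([[1.0, 0.0]])
R = np.array([[1.0]])


def riccati_step(P):
    """One prediction, fusion and update pass in the form of eq.'s(8.8,8.9)."""
    P_pred = F @ P @ F.T + G @ Q @ G.T
    info = 2.0 * np.linalg.inv(P_pred) + C.T @ np.linalg.inv(R) @ C
    return np.linalg.inv(info)


def steady_state(P0, tol=1e-12, max_iter=1000):
    """Iterate the recursion until successive covariances agree to tol."""
    P = P0
    for _ in range(max_iter):
        P_next = riccati_step(P)
        if np.max(np.abs(P_next - P)) < tol:
            return P_next
        P = P_next
    raise RuntimeError("Riccati iteration did not converge")


P_inf = steady_state(100.0 * np.eye(2))
K_inf = P_inf @ C.T @ np.linalg.inv(R)            # eq.(8.6)
M = 0.5 * (F @ P_inf @ F.T + G @ Q @ G.T)         # fused prediction covariance
rhs = M - K_inf @ (C @ M @ C.T + R) @ K_inf.T     # right-hand side of eq.(8.5)
residual = P_inf - rhs
print("max |residual of eq.(8.5)|:", np.max(np.abs(residual)))
```

A residual near machine precision indicates that the limit of the recursion satisfies eq.(8.5) for this example.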

Exponential Stability of $\frac{1}{2}(I - K_\infty C)F$

In order for $\frac{1}{2}(I - K_\infty C)F$ to be $l_2$-exponentially stable, it must have eigenvalues that are strictly less than $\frac{\sqrt{2}}{2}$ in magnitude. This fact follows from Theorem 6.1.

From Theorem 7.1 we know that the following system is exponentially stable with respect to $\|\cdot\|_2$.

$$z(t) = P_{ML}(m(t)|m(t))\,P_{ML}^{-1}(m(t)|m(t)-1)\,F\left(z(\alpha t) + z(\beta t)\right)$$ (8.20)

which can be rewritten as

$$z(t) = \frac{1}{2}(I - K(m(t))C)F\left(z(\alpha t) + z(\beta t)\right)$$ (8.21)

where

$$K(m(t)) = P_{ML}(m(t)|m(t))\,C^TR^{-1}$$ (8.22)
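The step from eq.(8.20) to eq.(8.21) can be checked directly. A short derivation, assuming the measurement update takes the information form of eq.(8.9) written with the indexing of eq.(8.20), i.e. $P_{ML}^{-1}(m|m) = 2P_{ML}^{-1}(m|m-1) + C^TR^{-1}C$:

$$I - K(m)C = P_{ML}(m|m)\left(P_{ML}^{-1}(m|m) - C^TR^{-1}C\right) = 2P_{ML}(m|m)\,P_{ML}^{-1}(m|m-1)$$

so that $\frac{1}{2}(I - K(m)C)F = P_{ML}(m|m)\,P_{ML}^{-1}(m|m-1)\,F$.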

But, since $\lim_{m \to \infty} P_{ML}(m|m) = P_\infty$, the system in eq.(8.21) in steady-state becomes

$$z(t) = \frac{1}{2}(I - K_\infty C)F\left(z(\alpha t) + z(\beta t)\right)$$ (8.23)
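As a numerical illustration of the steady-state stability claim, the sketch below computes the eigenvalues of $\frac{1}{2}(I - K_\infty C)F$ for an illustrative system (all matrix values are assumptions) and compares their magnitudes with $\sqrt{2}/2$. That threshold is equivalent to requiring $\sqrt{2}$ times the matrix to be stable in the standard sense, which is the scaling that appears in the causal system of eq.(7.1).

```python
import numpy as np

# Hypothetical constant system matrices (illustrative values only).
F = np.array([[0.9, 0.2], [0.0, 0.7]])
G = np.eye(2)
Q = 0.5 * np.eye(2)
C = np.array([[1.0, 0.0]])
R = np.array([[1.0]])


def riccati_step(P):
    """One prediction, fusion and update pass in the form of eq.'s(8.8,8.9)."""
    P_pred = F @ P @ F.T + G @ Q @ G.T
    info = 2.0 * np.linalg.inv(P_pred) + C.T @ np.linalg.inv(R) @ C
    return np.linalg.inv(info)


# Iterate to (numerical) steady state from a large finite covariance,
# used here as a stand-in for the ML initial condition.
P = 100.0 * np.eye(2)
for _ in range(200):
    P = riccati_step(P)

K_inf = P @ C.T @ np.linalg.inv(R)                # eq.(8.6)
A_tree = 0.5 * (np.eye(2) - K_inf @ C) @ F        # dynamics matrix of eq.(8.23)
max_magnitude = np.max(np.abs(np.linalg.eigvals(A_tree)))
threshold = np.sqrt(2.0) / 2.0
print("largest eigenvalue magnitude:", max_magnitude)
print("below sqrt(2)/2:", bool(max_magnitude < threshold))
```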

Uniqueness of $P_\infty$

Consider $P_1$ and $P_2$, both of which satisfy the steady state Riccati eq.(8.5). Thus,

$$P_1 = \frac{1}{2}FP_1F^T + \frac{1}{2}GQG^T - K_1\left(\frac{1}{2}CFP_1F^TC^T + \frac{1}{2}CGQG^TC^T + R\right)K_1^T$$ (8.24)

$$P_2 = \frac{1}{2}FP_2F^T + \frac{1}{2}GQG^T - K_2\left(\frac{1}{2}CFP_2F^TC^T + \frac{1}{2}CGQG^TC^T + R\right)K_2^T$$ (8.25)

Subtracting eq.(8.25) from eq.(8.24) we get

$$P_1 - P_2 = \frac{1}{2}(I - K_1C)F\,(P_1 - P_2)\left(\frac{1}{2}(I - K_1C)F\right)^T + \Delta$$ (8.26)

where $\Delta$ is a symmetric matrix. Note that we have established the fact that $\frac{1}{2}(I - K_1C)F$ has eigenvalues within the unit circle. From standard system theory this tells us that we can write $P_1 - P_2$ as a sum of positive semidefinite terms. This implies that $P_1 - P_2$ is positive semidefinite, or $P_1 \ge P_2$. By subtracting eq.(8.24) from eq.(8.25) and using the same argument we can establish that $P_2 \ge P_1$.

□ 
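Uniqueness can also be probed numerically. This is only a sanity check under assumed, illustrative matrices (not a substitute for the proof): the recursion of eq.'s(8.8,8.9) is started from two very different positive definite covariances and the two limits are compared.

```python
import numpy as np

# Hypothetical constant system matrices (illustrative values only).
F = np.array([[0.9, 0.2], [0.0, 0.7]])
G = np.eye(2)
Q = 0.5 * np.eye(2)
C = np.array([[1.0, 0.0]])
R = np.array([[1.0]])


def riccati_step(P):
    """One prediction, fusion and update pass in the form of eq.'s(8.8,8.9)."""
    P_pred = F @ P @ F.T + G @ Q @ G.T
    info = 2.0 * np.linalg.inv(P_pred) + C.T @ np.linalg.inv(R) @ C
    return np.linalg.inv(info)


def iterate(P0, steps=200):
    """Apply the recursion a fixed number of times."""
    P = P0
    for _ in range(steps):
        P = riccati_step(P)
    return P


P_from_small = iterate(0.01 * np.eye(2))   # small initial covariance
P_from_large = iterate(100.0 * np.eye(2))  # large initial covariance
gap = np.max(np.abs(P_from_small - P_from_large))
print("max difference between the two limits:", gap)
```

For this example both runs settle on the same matrix to within rounding error, as the theorem predicts for a reachable and observable system.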

Note that the preceding analysis assumed constant matrices $F$, $G$, $C$, $Q$, and $R$. If we begin with our original downward model eq.(2.2), eq.(2.9) with $A$, $B$, $C$, $Q$, and $R$ invertible, the constancy of $F$, $G$, and $Q$ requires that $P_x^{-1}$ is constant. As we are interested in asymptotic behavior, there is no loss of generality in assuming this, and there are two distinct cases. Specifically, if $A$ is stable, then the covariance $P_x(m(t))$ at all finite nodes (starting from an infinitely remote coarse level) is the positive definite (because of reachability) solution $P_x$ of eq.(2.4), and in this case, we have that

$$P(m|m) \to \left(P_\infty^{-1} + P_x^{-1}\right)^{-1}$$ (8.27)

On the other hand, if $A$ is unstable, $P_x^{-1}(m(t)) \to 0$ and

$$P(m|m) \to P_\infty$$ (8.28)

Note that the existence of two distinct limiting forms for $P(m|m)$, depending on the stability of the original model, is another significant deviation from standard causal theory.
9  Summary

In this paper we have analyzed in detail the filtering step of the Rauch-Tung-Striebel smoothing algorithm developed in [7] for the optimal estimation of a class of multiresolution stochastic processes. In particular we have developed the system-theoretic concepts necessary for the analysis of the stability and the steady-state properties of the filter. Notions of stability, reachability, and observability were developed for systems whose dynamics evolve upward on a dyadic tree. We then used these notions in showing stability of the optimal filter and steady-state convergence of the filter.
We define the following quantities.

(.29)

$$x_{t_0} = \Phi(t_0)\,X_{M,t_0}$$ (.30)

$$\Phi(t_0) = G\,\underbrace{[\,I \;\cdots\; I\,]}_{2^M\ \text{times}}$$ (.31)

where $G$ is invertible (and thus $\Phi(t_0)$ is onto). We use $\mathcal{N}(\cdot)$ and $\mathcal{R}(\cdot)$ to denote nullspace and rangespace, respectively. A system is upward-reconstructible if given $Y_{M,t_0}$, $x_{t_0}$ is uniquely determined, i.e. $\mathcal{N}(\mathcal{H}_M) \subset \mathcal{N}(\Phi(t_0))$. We first prove the following lemma.


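Lemma .1 For all $M$

$$\mathcal{H}_M^T\mathcal{H}_M\,\Phi^T(t_0) = \bar{\Lambda}\,\Phi^T(t_0)$$ (.32)

where

$$\bar{\Lambda} = \mathrm{diag}(\underbrace{\Lambda, \ldots, \Lambda}_{2^M\ \text{times}})$$ (.33)

and $\Lambda$ is some matrix.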

Proof 

The  structure  of  which  we  denoted  as  is  described  in  a  recursive 

fashion  in  eq.’s(4.20-4.24).  We  compute 

■  U{M,  +  2^-ir(M,  0)G^/2m-i  1 


By  repeating  this  procedure  M  —  1  more  times  we  get 

t/(M,0)#^(fo)  =  A$^(to) 

(.36) 

where 

A  =  diag(  A;^^  ) 

(.36) 

2^  times 
M-1

A  =  ^  i)  +  U(M,  M)  (.37) 

t=0 

□ 

We  prove  the  following  theorem. 

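Theorem .1 $\mathcal{N}(\mathcal{H}_M) \subset \mathcal{N}(\Phi(t_0))$ iff $\Phi(t_0)\mathcal{H}_M^T\mathcal{H}_M\Phi^T(t_0)$ is invertible.

Proof

a)

$\mathcal{N}(\mathcal{H}_M) \subset \mathcal{N}(\Phi(t_0)) \;\rightarrow\; \Phi(t_0)\mathcal{H}_M^T\mathcal{H}_M\Phi^T(t_0)$ is invertible

Assume $\Phi(t_0)\mathcal{H}_M^T\mathcal{H}_M\Phi^T(t_0)$ is not invertible. Then for some $y \ne 0$, $y^T\Phi(t_0)\mathcal{H}_M^T\mathcal{H}_M\Phi^T(t_0)y = 0$. This implies $\mathcal{H}_M\Phi^T(t_0)y = 0$. But the fact that $\Phi(t_0)$ is onto implies $\Phi^T(t_0)y \ne 0$. Furthermore, $\Phi^T(t_0)y \ne 0$ implies $\Phi(t_0)\Phi^T(t_0)y \ne 0$, since if it were true that $\Phi(t_0)\Phi^T(t_0)y = 0$, then $y^T\Phi(t_0)\Phi^T(t_0)y = 0$, which implies $\Phi^T(t_0)y = 0$. Thus, there exists a $z \ne 0$, namely $\Phi^T(t_0)y$, such that $\mathcal{H}_Mz = 0$ and $\Phi(t_0)z \ne 0$; i.e. it is not true that $\mathcal{N}(\mathcal{H}_M) \subset \mathcal{N}(\Phi(t_0))$.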

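b)

$\Phi(t_0)\mathcal{H}_M^T\mathcal{H}_M\Phi^T(t_0)$ is invertible $\;\rightarrow\; \mathcal{N}(\mathcal{H}_M) \subset \mathcal{N}(\Phi(t_0))$

Assume that $\mathcal{N}(\mathcal{H}_M) \subset \mathcal{N}(\Phi(t_0))$ is false; i.e. there exists an $x$ such that $\mathcal{H}_Mx = 0$ and $\Phi(t_0)x \ne 0$. Since $x \in \mathcal{R}(\Phi^T(t_0)) \oplus \mathcal{N}(\Phi(t_0))$, we can write $x = x_{\mathcal{R}(\Phi^T(t_0))} + x_{\mathcal{N}(\Phi(t_0))}$, where $x_{\mathcal{R}(\Phi^T(t_0))}$ is non-zero and $x_{\mathcal{N}(\Phi(t_0))}$ may or may not be non-zero. Since $\mathcal{H}_Mx = 0$, $\mathcal{H}_Mx_{\mathcal{R}(\Phi^T(t_0))} + \mathcal{H}_Mx_{\mathcal{N}(\Phi(t_0))} = 0$, which means that $\mathcal{H}_M\Phi^T(t_0)y + \mathcal{H}_Mx_{\mathcal{N}(\Phi(t_0))} = 0$ for some $y \ne 0$. Left multiplying by $\Phi(t_0)\mathcal{H}_M^T$ we get

$$\Phi(t_0)\mathcal{H}_M^T\mathcal{H}_M\Phi^T(t_0)y + \Phi(t_0)\mathcal{H}_M^T\mathcal{H}_Mx_{\mathcal{N}(\Phi(t_0))} = 0$$ (.38)

But from Lemma .1 and our definition for $\Phi(t_0)$, we get

$$\Phi(t_0)\mathcal{H}_M^T\mathcal{H}_M = \Phi(t_0)\bar{\Lambda}^T = G\Lambda^T\underbrace{[\,I \;\cdots\; I\,]}_{2^M\ \text{times}}$$ (.39)

By substituting (.39) into (.38), we get

$$\Phi(t_0)\mathcal{H}_M^T\mathcal{H}_M\Phi^T(t_0)y + G\Lambda^T\underbrace{[\,I \;\cdots\; I\,]}_{2^M\ \text{times}}x_{\mathcal{N}(\Phi(t_0))} = 0$$ (.40)

for $y \ne 0$. But $x_{\mathcal{N}(\Phi(t_0))} \in \mathcal{N}(\Phi(t_0))$ implies that $\Phi(t_0)x_{\mathcal{N}(\Phi(t_0))} = 0$ or, using the definition of $\Phi(t_0)$, $G[\,I \;\cdots\; I\,]x_{\mathcal{N}(\Phi(t_0))} = 0$. But since $G$ is invertible, $[\,I \;\cdots\; I\,]x_{\mathcal{N}(\Phi(t_0))} = 0$. Thus, eq.(.40) collapses to $\Phi(t_0)\mathcal{H}_M^T\mathcal{H}_M\Phi^T(t_0)y = 0$ for some $y \ne 0$, implying that $y^T\Phi(t_0)\mathcal{H}_M^T\mathcal{H}_M\Phi^T(t_0)y = 0$ for some $y \ne 0$; i.e. $\Phi(t_0)\mathcal{H}_M^T\mathcal{H}_M\Phi^T(t_0)$ is not invertible.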

□ 


